GLYPHOSATE TOLERANT
5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASES



This is a continuation-in-part of a copending U.S. 
patent application having serial number 07/576,537, filed August 
31, 1990 and entitled "Glyphosate Tolerant 
5-Enolpyruvylshikimate-3-Phosphate Synthases."

BACKGROUND OF THE INVENTION

This invention relates in general to plant molecular 
biology and, more particularly, to a new class of glyphosate
tolerant 5-enolpyruvylshikimate-3-phosphate synthases.

Recent advances in genetic engineering have provided 
the requisite tools to transform plants to contain foreign genes. It 
is now possible to produce plants which have unique 
characteristics of agronomic importance. Certainly, one such 
advantageous trait is more cost effective, environmentally
compatible weed control via herbicide tolerance. 
Herbicide-tolerant plants may reduce the need for tillage to control 
weeds thereby effectively reducing soil erosion. 

One herbicide which is the subject of much 
investigation in this regard is N-phosphonomethylglycine 
commonly referred to as glyphosate. Glyphosate inhibits the
shikimic acid pathway which leads to the biosynthesis of aromatic
compounds including amino acids, plant hormones and vitamins.
Specifically, glyphosate curbs the conversion of
phosphoenolpyruvic acid (PEP) and 3-phosphoshikimic acid to
5-enolpyruvyl-3-phosphoshikimic acid by inhibiting the enzyme
5-enolpyruvylshikimate-3-phosphate synthase (hereinafter
referred to as EPSP synthase or EPSPS).

It has been shown that glyphosate tolerant plants can
be produced by inserting into the genome of the plant the capacity
to produce a higher level of EPSP synthase in the chloroplast of the
cell (Shah et al., 1986), which enzyme is preferably glyphosate
tolerant (Kishore et al., 1988). Variants of the wild-type EPSPS
enzyme have been isolated which are glyphosate tolerant as a
result of alterations in the EPSPS amino acid coding sequence
(Kishore and Shah, 1988; Schulz et al., 1984; Sost et al., 1984;
Kishore et al., 1986). These variants typically have a higher Ki for
glyphosate than the wild-type EPSPS enzyme, which confers the
glyphosate tolerant phenotype, but these variants are also
characterized by a high Km for PEP, which makes the enzyme
kinetically less efficient (Kishore and Shah, 1988; Sost et al., 1984;
Schulz et al., 1984; Kishore et al., 1986; Sost and Amrhein, 1990).

For example, the apparent Km for PEP and the apparent Ki for
glyphosate for the native EPSPS from E. coli are 10 µM and 0.5 µM,
while for a glyphosate tolerant isolate having a single amino acid
substitution of an alanine for the glycine at position 96 these
values are 220 µM and 4.0 mM, respectively. A number of
glyphosate tolerant plant variant EPSPS genes have been
constructed by mutagenesis. Again, the glyphosate tolerant
EPSPS was impaired due to an increase in the Km for PEP and a
slight reduction of the Vmax of the native plant enzyme (Kishore
and Shah, 1988), thereby lowering the catalytic efficiency
(Vmax/Km) of the enzyme. Since the kinetic constants of the
variant enzymes are impaired with respect to PEP, it has been
proposed that high levels of overproduction of the variant enzyme,
40-80 fold, would be required to maintain normal catalytic activity
in plants in the presence of glyphosate (Kishore et al., 1988).
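
The trade-off described above can be made concrete with a short calculation. The following is a minimal sketch, assuming the E. coli values quoted in the text are all in micromolar units (10 µM and 0.5 µM for the native enzyme, 220 µM and 4.0 mM for the position-96 variant); the function and variable names are illustrative only and do not come from the specification.

```python
def ki_over_km(ki_um: float, km_um: float) -> float:
    """Return Ki(glyphosate) / Km(PEP); both arguments in micromolar."""
    return ki_um / km_um


# Values quoted in the text for E. coli EPSPS (assumed units: micromolar).
wild_type = {"km_pep_um": 10.0, "ki_glyphosate_um": 0.5}
# Gly -> Ala at position 96; 4.0 mM is converted to 4000 micromolar.
variant_g96a = {"km_pep_um": 220.0, "ki_glyphosate_um": 4.0 * 1000.0}

wt_ratio = ki_over_km(wild_type["ki_glyphosate_um"], wild_type["km_pep_um"])
variant_ratio = ki_over_km(
    variant_g96a["ki_glyphosate_um"], variant_g96a["km_pep_um"]
)

# Fold changes caused by the substitution.
km_fold_increase = variant_g96a["km_pep_um"] / wild_type["km_pep_um"]
ki_fold_increase = (
    variant_g96a["ki_glyphosate_um"] / wild_type["ki_glyphosate_um"]
)

print(f"wild type Ki/Km = {wt_ratio:.2f}")
print(f"variant   Ki/Km = {variant_ratio:.1f}")
print(f"Km(PEP) rises {km_fold_increase:.0f}-fold")
print(f"Ki(glyphosate) rises {ki_fold_increase:.0f}-fold")
```

Under these assumptions the substitution raises the Ki for glyphosate far more than the Km for PEP, but the 22-fold loss of affinity for PEP is the kinetic penalty that the text says must be offset by overproduction of the variant enzyme.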

While such variant EPSP synthases have proved
useful in obtaining transgenic plants tolerant to glyphosate, it
would be increasingly beneficial to obtain an EPSP synthase that is
highly glyphosate tolerant while still kinetically efficient such that
the amount of the glyphosate tolerant EPSPS needed to be produced
to maintain normal catalytic activity in the plant is reduced or that
improved tolerance be obtained with the same expression level.

Previous studies have shown that EPSPS enzymes
from different sources vary widely with respect to their degree of
sensitivity to inhibition by glyphosate. A study of plant and
bacterial EPSPS enzyme activity as a function of glyphosate
concentration showed that there was a very wide range in the
degree of sensitivity to glyphosate. The degree of sensitivity
showed no correlation with any genus or species tested (Schulz et
al., 1985). Insensitivity to glyphosate inhibition of the activity of the
EPSPS from the Pseudomonas sp. PG2982 has also been reported
but with no details of the studies (Fitzgibbon, 1988). In general,
while such natural tolerance has been reported, there is no report
suggesting the kinetic superiority of the naturally occurring
bacterial glyphosate tolerant EPSPS enzymes over those of mutated
EPSPS enzymes nor have any of the genes been characterized.
Similarly, there are no reports on the expression of naturally
glyphosate tolerant EPSPS enzymes in plants to confer glyphosate
tolerance.


SUMMARY OF THE INVENTION

A DNA molecule comprising DNA encoding a
kinetically efficient, glyphosate tolerant EPSP synthase is
presented. The EPSP synthases of the present invention reduce
the amount of overproduction of the EPSPS enzyme in a transgenic
plant necessary for the enzyme to maintain catalytic activity while
still conferring glyphosate tolerance. This and other EPSP
synthases described herein represent a new class of EPSPS
enzymes, referred to hereinafter as Class II EPSPS enzymes.

Class II EPSPS enzymes share little homology to known bacterial
or plant EPSPS enzymes and exhibit tolerance to glyphosate while
maintaining suitable Km (PEP) ranges. Suitable ranges of Km
(PEP) for EPSPS enzymes of the present invention are between
1-150 µM, with a more preferred range of between 1-35 µM, and a
most preferred range between 2-25 µM. These kinetic constants
are determined under the assay conditions specified hereinafter.
The Vmax of the enzyme should preferably be at least 15% of the
uninhibited plant enzyme and more preferably greater than 25%.
An EPSPS of the present invention preferably has a Ki for
glyphosate range of between 25-10000 µM. The Ki/Km ratio should
be between 3-500, and more preferably between 6-250. The Vmax
should preferably be in the range of 2-100 units/mg
(µmoles/minute.mg at 25°C) and the Km for shikimate-3-phosphate
should preferably be in the range of 0.1 to 50 µM.
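
The numerical ranges above lend themselves to a simple screening check. The sketch below is illustrative only: the class and function names are hypothetical, it covers just the Km (PEP), Ki (glyphosate) and Ki/Km ranges (the Vmax and shikimate-3-phosphate criteria are omitted), and it treats the range endpoints as inclusive, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Kinetic constants of one EPSPS enzyme, in micromolar."""
    km_pep_um: float
    ki_glyphosate_um: float

    @property
    def ki_over_km(self) -> float:
        return self.ki_glyphosate_um / self.km_pep_um


def in_range(value: float, low: float, high: float) -> bool:
    # Endpoints are treated as inclusive (an assumption of this sketch).
    return low <= value <= high


def meets_broad_ranges(k: Kinetics) -> bool:
    """Check the broadest ranges given in the text for the three constants."""
    return (
        in_range(k.km_pep_um, 1.0, 150.0)
        and in_range(k.ki_glyphosate_um, 25.0, 10000.0)
        and in_range(k.ki_over_km, 3.0, 500.0)
    )


# Example inputs: hypothetical enzymes chosen to fall inside and outside the ranges.
inside = Kinetics(km_pep_um=12.0, ki_glyphosate_um=2720.0)
outside = Kinetics(km_pep_um=5.0, ki_glyphosate_um=0.4)

print(meets_broad_ranges(inside))
print(meets_broad_ranges(outside))
```

An enzyme with a low Km for PEP but a Ki for glyphosate below 25 µM fails the check, which mirrors the point of the passage: both constants must be favorable at once.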

Genes coding for Class II EPSPS enzymes have been
isolated from three (3) different bacteria: Agrobacterium
tumefaciens sp. strain CP4, Achromobacter sp. strain LBAA, and
Pseudomonas sp. strain PG2982. The LBAA and PG2982 Class II
EPSPS genes have been determined to be identical and the proteins
encoded by these two genes are very similar to the CP4 protein and
share approximately 84% amino acid identity with it. Class II
EPSPS enzymes can be readily distinguished from Class I EPSPS's
by their inability to react with polyclonal antibodies prepared from
Class I EPSPS enzymes under conditions where other Class I
EPSPS enzymes would readily react with the Class I antibodies.

Other Class II EPSPS enzymes can be readily isolated
and identified by utilizing a nucleic acid probe from one of the
Class II EPSPS genes disclosed herein using standard
hybridization techniques. Such a probe from the CP4 strain has
been prepared and utilized to isolate the Class II EPSPS genes
from strains LBAA and PG2982. These genes may also be adapted
for enhanced expression in plants by known methodology. Such a
probe has also been used to identify homologous genes in bacteria
isolated de novo from soil.

The Class II EPSPS enzymes are preferably fused to a
chloroplast transit peptide (CTP) to target the protein to the
chloroplasts of the plant into which it may be introduced.
Chimeric genes encoding this CTP-Class II EPSPS fusion protein
may be prepared with an appropriate promoter and 3'
polyadenylation site for introduction into a desired plant by
standard methods.

Therefore, in one aspect, the present invention
provides a new class of EPSP synthases that exhibit a low Km for
phosphoenolpyruvate (PEP), a high Vmax/Km ratio, and a high Ki
for glyphosate such that when introduced into a plant, the plant is
made glyphosate tolerant such that the catalytic activity of the
enzyme and plant metabolism are maintained in a substantially
normal state. For purposes of this discussion, a highly efficient
EPSPS refers to its efficiency in the presence of glyphosate.

In another aspect of the present invention, a
double-stranded DNA molecule comprising DNA encoding a
Class II EPSPS enzyme is disclosed. A Class II EPSPS enzyme
DNA sequence is disclosed from three sources: Agrobacterium sp.
strain designated CP4, Achromobacter sp. strain LBAA and
Pseudomonas sp. strain PG2982.

In a further aspect of the present invention, a nucleic
acid probe from an EPSPS Class II gene is presented that is
suitable for use in screening for Class II EPSPS genes in other
sources by assaying for the ability of a DNA sequence from the
other source to hybridize to the probe.

In yet another aspect of the present invention, 
transgenic plants and transformed plant cells are disclosed that
are made glyphosate tolerant by the introduction of a Class II 
EPSPS gene into the plant's genome. 

In a still further aspect of the invention, a
recombinant, double-stranded DNA molecule comprising in
sequence:

a) a promoter which functions in plant cells to cause
   the production of an RNA sequence;
b) a structural DNA sequence that causes the
   production of an RNA sequence which encodes a
   Class II EPSPS enzyme; and
c) a 3' nontranslated region which functions in plant
   cells to cause the addition of a stretch of polyadenyl
   nucleotides to the 3' end of the RNA sequence

where the promoter is heterologous with respect to the structural
DNA sequence and adapted to cause sufficient expression of the
fusion polypeptide to enhance the glyphosate tolerance of a plant
cell transformed with said DNA molecule.
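
The ordered three-part arrangement recited above (promoter, structural sequence, 3' nontranslated region) can be modelled as a small data structure. This is only a sketch for illustration: the role labels, the class and the helper function are hypothetical, and the element names in the example (FMV35S, a CTP2-CP4 EPSPS fusion, the E9 3' region) are simply components mentioned elsewhere in this specification, combined here as an assumed example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Element:
    """One functional element of a chimeric plant gene."""
    role: str   # hypothetical role label used by this sketch
    name: str   # descriptive name of the element


# The order required by the text: a), b), c).
REQUIRED_ORDER = ("promoter", "structural_sequence", "three_prime_region")


def has_required_order(elements: Sequence[Element]) -> bool:
    """True if the elements appear exactly in promoter/coding/3' order."""
    return tuple(e.role for e in elements) == REQUIRED_ORDER


cassette = [
    Element("promoter", "FMV35S"),
    Element("structural_sequence", "CTP2-CP4 EPSPS"),
    Element("three_prime_region", "E9 3'"),
]

print(has_required_order(cassette))
print(has_required_order(list(reversed(cassette))))
```

The check only enforces element order; the further requirement that the promoter be heterologous to the structural sequence is not modelled here.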

In still another aspect of the present invention, a
method for selectively controlling weeds in a crop field is presented
by planting crop seeds or crop plants transformed with a Class II
EPSPS gene to confer glyphosate tolerance to the plants which
allows for glyphosate containing herbicides to be applied to the
crop to selectively kill the glyphosate sensitive weeds, but not the
crops.



WO 92/04449                                        PCT/US91/06148

Other and further objects, advantages and aspects of 
the invention will become apparent from the accompanying 
drawing figures and the description of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the DNA sequence (SEQ ID NO:1) for
the full-length promoter of figwort mosaic virus (FMV35S).

Figure 2 shows the cosmid cloning vector pMON17020.

Figure 3 shows the structural DNA sequence (SEQ ID
NO:2) for the Class II EPSPS gene from bacterial isolate
Agrobacterium sp. strain CP4 and the deduced amino acid
sequence (SEQ ID NO:3).

Figure 4 shows the structural DNA sequence (SEQ ID
NO:4) for the Class II EPSPS gene from the bacterial isolate
Achromobacter sp. strain LBAA and the deduced amino acid
sequence (SEQ ID NO:5).

Figure 5 shows the structural DNA sequence (SEQ ID
NO:6) for the Class II EPSPS gene from the bacterial isolate
Pseudomonas sp. strain PG2982 and the deduced amino acid
sequence (SEQ ID NO:7).

Figure 6 shows the Bestfit comparison of the E. coli
EPSPS amino acid sequence (SEQ ID NO:8) with that for the CP4
EPSPS (SEQ ID NO:3).

Figure 7 shows the Bestfit comparison of the CP4
EPSPS amino acid sequence (SEQ ID NO:3) with that for the
LBAA EPSPS (SEQ ID NO:5).

Figure 8 shows the structural DNA sequence (SEQ ID
NO:9) for the synthetic CP4 Class II EPSPS gene.

Figure 9 shows the DNA sequence (SEQ ID NO:10) of
the chloroplast transit peptide (CTP) and encoded amino acid
sequence (SEQ ID NO:11) derived from the Arabidopsis thaliana
EPSPS CTP and containing a SphI restriction site at the
chloroplast processing site, hereinafter referred to as CTP2.

Figure 10 shows the DNA sequence (SEQ ID NO:12) of
the chloroplast transit peptide and encoded amino acid sequence
(SEQ ID NO:13) derived from the Arabidopsis thaliana EPSPS gene
and containing an EcoRI restriction site within the mature region
of the EPSPS, hereinafter referred to as CTP3.

Figure 11 shows the DNA sequence (SEQ ID NO:14) of
the chloroplast transit peptide and encoded amino acid sequence
(SEQ ID NO:15) derived from the Petunia hybrida EPSPS CTP and
containing a SphI restriction site at the chloroplast processing site
and in which the amino acids at the processing site are changed to
-Cys-Met-, hereinafter referred to as CTP4.

Figure 12 shows the DNA sequence (SEQ ID NO:16) of
the chloroplast transit peptide and encoded amino acid sequence
(SEQ ID NO:17) derived from the Petunia hybrida EPSPS gene with
the naturally occurring EcoRI site in the mature region of the
EPSPS gene, hereinafter referred to as CTP5.

Figure 13 shows a plasmid map of CP4 plant
transformation/expression vector pMON17110.

Figure 14 shows a plasmid map of CP4 synthetic
EPSPS gene plant transformation/expression vector pMON17131.

Figure 15 shows a plasmid map of CP4 EPSPS free
DNA plant transformation expression vector pMON13640.

Figure 16 shows a plasmid map of CP4 plant
transformation/direct selection vector pMON17227.

Figure 17 shows a plasmid map of CP4 plant
transformation/expression vector pMON19653.

STATEMENT OF THE INVENTION

The expression of a plant gene which exists in 
double-stranded DNA form involves synthesis of messenger RNA 
(mRNA) from one strand of the DNA by RNA polymerase enzyme,
and the subsequent processing of the mRNA primary transcript 
inside the nucleus. This processing involves a 3' non-translated 
region which adds polyadenylate nucleotides to the 3' end of the 
RNA. 

Transcription of DNA into mRNA is regulated by a
region of DNA usually referred to as the "promoter." The
promoter region contains a sequence of bases that signals RNA
polymerase to associate with the DNA, and to initiate the
transcription into mRNA using one of the DNA strands as a
template to make a corresponding complementary strand of RNA.

A number of promoters which are active in plant cells
have been described in the literature. These include the nopaline
synthase (NOS) and octopine synthase (OCS) promoters (which are
carried on tumor-inducing plasmids of Agrobacterium
tumefaciens), the cauliflower mosaic virus (CaMV) 19S and 35S
promoters, the light-inducible promoter from the small subunit of
ribulose bis-phosphate carboxylase (ssRUBISCO, a very abundant
plant polypeptide) and the full-length transcript promoter from the
figwort mosaic virus (FMV35S). All of these promoters have been
used to create various types of DNA constructs which have been
expressed in plants; see, e.g., PCT publication WO 84/02913
(Rogers et al., Monsanto).

Promoters which are known or are found to cause
transcription of DNA in plant cells can be used in the present
invention. Such promoters may be obtained from a variety of
sources such as plants and plant DNA viruses and include, but
are not limited to, the CaMV35S and FMV35S promoters and
promoters isolated from plant genes such as ssRUBISCO genes.
As described below, it is preferred that the particular promoter
selected should be capable of causing sufficient expression to
result in the production of an effective amount of a Class II EPSPS
to render the plant substantially tolerant to glyphosate herbicides.
The amount of Class II EPSPS needed to induce the desired
tolerance may vary with the plant species. It is preferred that the
promoters utilized have relatively high expression in all
meristematic tissues in addition to other tissues inasmuch as it is
now known that glyphosate is translocated and accumulated in
this type of plant tissue. Alternatively, a combination of chimeric
genes can be used to cumulatively result in the necessary overall
expression level of the selected Class II EPSPS enzyme to result in
the glyphosate tolerant phenotype.

The mRNA produced by a DNA construct of the
present invention also contains a 5' non-translated leader
sequence. This sequence can be derived from the promoter
selected to express the gene, and can be specifically modified so as
to increase translation of the mRNA. The 5' non-translated
regions can also be obtained from viral RNAs, from suitable
eukaryotic genes, or from a synthetic gene sequence. The present
invention is not limited to constructs, as presented in the following
examples, wherein the non-translated region is derived from both
the 5' non-translated sequence that accompanies the promoter
sequence and part of the 5' non-translated region of the virus coat
protein gene. Rather, the non-translated leader sequence can be
derived from an unrelated promoter or coding sequence as
discussed above.

A preferred promoter for use in the present invention
is the full-length transcript (SEQ ID NO:1) promoter from the
figwort mosaic virus (FMV35S) which functions as a strong and
uniform promoter with particularly good expression in
meristematic tissue for chimeric genes inserted into plants,
particularly dicotyledons. The resulting transgenic plant in
general expresses the protein encoded by the inserted gene at a
higher and more uniform level throughout the tissues and cells of
the transformed plant than the same gene driven by an enhanced
CaMV35S promoter. Referring to Figure 1, the DNA sequence
(SEQ ID NO:1) of the FMV35S promoter is located between
nucleotides 6368 and 6930 of the FMV genome. A 5' non-translated
leader sequence is preferably coupled with the promoter. The
leader sequence can be from the FMV35S genome itself or can be
from a source other than FMV35S.

The 3' non-translated region of the chimeric plant
gene contains a polyadenylation signal which functions in plants
to cause the addition of polyadenylate nucleotides to the 3' end of
the viral RNA. Examples of suitable 3' regions are (1) the
3' transcribed, non-translated regions containing the
polyadenylated signal of Agrobacterium tumor-inducing (Ti)
plasmid genes, such as the nopaline synthase (NOS) gene, and (2)
plant genes like the soybean storage protein genes and the small
subunit of the ribulose-1,5-bisphosphate carboxylase (ssRUBISCO)
gene. An example of a preferred 3' region is that from the
ssRUBISCO gene from pea (E9), described in greater detail below.

The DNA constructs of the present invention also
contain a structural coding sequence in double-stranded DNA
form which encodes a glyphosate tolerant, highly efficient Class II
EPSPS enzyme.



Identification of glyphosate tolerant, highly efficient
EPSPS enzymes

In an attempt to identify and isolate glyphosate
tolerant, highly efficient EPSPS enzymes, kinetic analysis of the
EPSPS enzymes from a number of bacteria exhibiting tolerance to
glyphosate or that had been isolated from suitable sources was
undertaken. It was discovered that in some cases the EPSPS
enzymes showed no tolerance to inhibition by glyphosate and it
was concluded that the tolerance phenotype of the bacterium was
due to an impermeability to glyphosate or other factors. In a
number of cases, however, microorganisms were identified whose
EPSPS enzyme showed a greater degree of tolerance to inhibition
by glyphosate and that displayed a low Km for PEP when compared
to that previously reported for other microbial and plant sources.
The EPSPS enzymes from these microorganisms were then
subjected to further study and analysis.

Table I displays the data obtained for the EPSPS
enzymes identified and isolated as a result of the above described
analysis. Table I includes data for three identified Class II EPSPS
enzymes that were observed to have a high tolerance to inhibition
by glyphosate and a low Km for PEP as well as data for the native
Petunia EPSPS and a glyphosate tolerant variant of the Petunia
EPSPS referred to as GA101. The GA101 variant is so named
because it exhibits the substitution of an alanine residue for a
glycine residue at position 101 (with respect to Petunia) in the
invariant region. When the change introduced into the Petunia
EPSPS (GA101) was introduced into a number of other EPSPS
enzymes, similar changes in kinetics were observed: an elevation
of the Ki for glyphosate and of the Km for PEP.

Table I

ENZYME            Km PEP         Ki Glyphosate    Ki/Km
SOURCE            (µM)           (µM)

Petunia           5              0.4              0.08
Petunia GA101     200            2000             10
PG2982            2.1-3.1 (1)    25-82            ~8-40
LBAA              ~7.3-8 (2)     60 (est)         ~7.9
CP4               12 (3)         2720             227

(1) Range of PEP tested = 1-40 µM
(2) Range of PEP tested = 5-80 µM
(3) Range of PEP tested = 1.5-40 µM
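
The Ki/Km column of Table I can be recomputed from the other two columns as a consistency check. The sketch below assumes the table values are read as Km = 5, 200 and 12 µM and Ki = 0.4, 2000 and 2720 µM for Petunia, Petunia GA101 and CP4, and as ranges of 2.1-3.1 µM and 25-82 µM for PG2982; the variable names are illustrative.

```python
# (Km for PEP in micromolar, Ki for glyphosate in micromolar), as read from Table I.
single_values = {
    "Petunia": (5.0, 0.4),
    "Petunia GA101": (200.0, 2000.0),
    "CP4": (12.0, 2720.0),
}

ratios = {name: ki / km for name, (km, ki) in single_values.items()}

# PG2982 is reported as ranges, so the ratio is bracketed by the extremes:
# lowest Ki over highest Km, and highest Ki over lowest Km.
pg2982_km_range = (2.1, 3.1)
pg2982_ki_range = (25.0, 82.0)
pg2982_ratio_low = pg2982_ki_range[0] / pg2982_km_range[1]
pg2982_ratio_high = pg2982_ki_range[1] / pg2982_km_range[0]

for name, ratio in ratios.items():
    print(f"{name}: Ki/Km = {ratio:.2f}")
print(f"PG2982: Ki/Km between {pg2982_ratio_low:.1f} and {pg2982_ratio_high:.1f}")
```

Read this way, the CP4 ratio rounds to 227 and the PG2982 bounds fall at roughly 8 and 39, in line with the approximate range given in the table.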

The Agrobacterium sp. strain CP4 was initially
identified by its ability to grow on glyphosate as a carbon source (10
mM) in the presence of 1 mM phosphate. The strain CP4 was
identified from a collection obtained from a fixed-bed immobilized
cell column that employed Manville R-635 diatomaceous earth
beads. The column had been run for three months on a
waste-water feed from a glyphosate production plant. The column
contained 50 mg/ml glyphosate and NH3 as NH4Cl. Total organic
carbon was 300 mg/ml and BOD's (Biological Oxygen Demand - a
measure of "soft" carbon availability) were less than 30 mg/ml.
This treatment column has been described (Heitkamp et al., 1990).
Dworkin-Foster minimal salts medium containing glyphosate at
10 mM and with phosphate at 1 mM was used to select for
microbes from a wash of this column that were capable of growing
on glyphosate as sole carbon source. Dworkin-Foster minimal
medium was made up by combining in 1 liter (with autoclaved
H2O), 1 ml each of A, B and C and 10 ml of D (as per below) and
thiamine HCl (5 mg).

A. D-F Salts (1000X stock; per 100 ml; autoclaved):
      H3BO3              1 mg
      MnSO4.7H2O         1 mg
      ZnSO4.7H2O         12.5 mg
      CuSO4.5H2O         8 mg
      NaMoO3.3H2O        1.7 mg

B. FeSO4.7H2O (1000X stock; per 100 ml; autoclaved)
      0.1 g

C. MgSO4.7H2O (1000X stock; per 100 ml; autoclaved)
      20 g

D. (NH4)2SO4 (100X stock; per 100 ml; autoclaved)
      20 g
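
The final concentration of each salt in the finished medium follows from the stock recipe and the volume added per liter. The following is a rough sketch under the assumption that the final volume is exactly 1 liter, that 1 ml of stocks B and C and 10 ml of stock D are added as the text states, and that each stock is made up to 100 ml; the function name and the dictionary layout are illustrative.

```python
def final_mg_per_liter(grams_per_100_ml: float, ml_added_per_liter: float) -> float:
    """Final concentration (mg/l) after diluting a stock into 1 liter of medium."""
    grams_per_ml = grams_per_100_ml / 100.0
    return grams_per_ml * ml_added_per_liter * 1000.0


# name: (grams of salt per 100 ml of stock, ml of stock added per liter)
stocks = {
    "FeSO4.7H2O": (0.1, 1.0),    # stock B, 1000X
    "MgSO4.7H2O": (20.0, 1.0),   # stock C, 1000X
    "(NH4)2SO4": (20.0, 10.0),   # stock D, 100X
}

final_concentrations = {
    name: final_mg_per_liter(grams, ml) for name, (grams, ml) in stocks.items()
}

for name, mg_per_l in final_concentrations.items():
    print(f"{name}: {mg_per_l:.0f} mg/l")
```

On these assumptions the medium contains about 1 mg/l FeSO4.7H2O, 200 mg/l MgSO4.7H2O and 2 g/l (NH4)2SO4, consistent with the 1000X and 100X labels on the stocks.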


Yeast Extract (YE; Difco) was added to a final
concentration of 0.01 or 0.001%. The strain CP4 was also grown on
media composed of D-F salts, amended as described above,
containing glucose, gluconate and citrate (each at 0.1%) as carbon
sources and with inorganic phosphate (0.2 - 1.0 mM) as the
phosphorus source.

Other Class II EPSPS containing microorganisms
were identified as Achromobacter sp. strain LBAA, which was
from a collection of bacteria previously described (Hallas et al.,
1988), and Pseudomonas sp. strain PG2982 which has been
described in the literature (Moore et al., 1983; Fitzgibbon, 1988). It
had been reported previously, from measurements in crude
lysates, that the EPSPS enzyme from strain PG2982 was less
sensitive to inhibition by glyphosate than that of E. coli, but there
has been no report of the details of this lack of sensitivity and there
has been no report on the Km for PEP for this enzyme or of the DNA
sequence for the gene for this enzyme (Fitzgibbon, 1988; Fitzgibbon
and Braymer, 1990).

Relationship of Class II EPSPS to those previously studied

All EPSPS proteins studied to date have shown a
remarkable degree of homology. For example, bacterial and plant
EPSPS's are about 54% identical and with similarity as high as
80%. Within bacterial EPSPS's and plant EPSPS's themselves the
degree of identity and similarity is much greater (see Table II).



Table II. Comparison between exemplary Class I EPSPS

                                 Similarity (%)   Identity (%)
E. coli vs. S. typhimurium       93.0             88.3
P. hybrida vs. E. coli           71.9             54.5
P. hybrida vs. Tomato            92.8             88.2

The EPSPS sequences compared here were obtained from the following
references: E. coli, Rogers et al., 1983; S. typhimurium, Stalker et al., 1985;
Petunia hybrida, Shah et al., 1986; and Tomato, Gasser et al., 1988.

When crude extracts of CP4 and LBAA bacteria (50 µg
protein) were probed using rabbit anti-EPSPS antibody (Padgette et
al., 1987) to the Petunia EPSPS protein in a Western analysis, no
positive signal could be detected, even with extended exposure
times (Protein A - 125I development system) and under conditions
where the control EPSPS (Petunia EPSPS, 20 ng; a Class I EPSPS)
was readily detected. The presence of EPSPS activity in these
extracts was confirmed by enzyme assay. This surprising result,
indicating a lack of similarity between the EPSPS's from these
bacterial isolates and those previously studied, coupled with the
combination of a low Km for PEP and a high Ki for glyphosate,
illustrates that these new EPSPS enzymes are different from
known EPSPS enzymes (now referred to as Class I EPSPS).

Glyphosate Tolerant Enzymes in Microbial Isolates

For clarity and brevity of disclosure, the following
description of the isolation of genes encoding Class II EPSPS
enzymes is directed to the isolation of such a gene from a bacterial
isolate. Those skilled in the art will recognize that the same or
similar strategy can be utilized to isolate such genes from other
microbial isolates, plant or fungal sources.

Cloning of the Agrobacterium sp. strain CP4 EPSPS Gene(s) in
E. coli

Having established the existence of a suitable EPSPS
in Agrobacterium sp. strain CP4, two parallel approaches were
undertaken to clone the gene: cloning based on the expected
phenotype for a glyphosate tolerant EPSPS; and purification of the
enzyme to provide material to raise antibodies and to obtain amino
acid sequences from the protein to facilitate the verification of
clones. Cloning and genetic techniques, unless otherwise
indicated, are generally those described in Maniatis et al., 1982 or
Sambrook et al., 1987. The cloning strategy was as follows:
introduction of a cosmid bank of Agrobacterium sp. strain
CP4 into E. coli and selection for the EPSPS gene by selection for
growth on inhibitory concentrations of glyphosate.

Chromosomal DNA was prepared from
Agrobacterium sp. strain CP4 as follows: The cell pellet from a
200 ml L-Broth (Miller, 1972), late log phase culture of
Agrobacterium sp. strain CP4 was resuspended in 10 ml of
Solution I; 50 mM Glucose, 10 mM EDTA, 25 mM Tris-Cl pH 8.0
(Birnboim and Doly, 1979). SDS was added to a final concentration
of 1% and the suspension was subjected to three freeze-thaw
cycles, each consisting of immersion in dry ice for 15 minutes and
in water at 70°C for 10 minutes. The lysate was then extracted
four times with equal volumes of phenol:chloroform (1:1; phenol
saturated with TE; TE = 10 mM Tris pH 8.0; 1.0 mM EDTA) and the
phases separated by centrifugation (15000g; 10 minutes). The
ethanol-precipitable material was pelleted from the supernatant by
brief centrifugation (8000g; 5 minutes) following addition of two
volumes of ethanol. The pellet was resuspended in 5 ml TE and
dialyzed for 16 hours at 4°C against 2 liters TE. This preparation
yielded a 5 ml DNA solution of 552 µg/ml.
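
As a quick check on scale, the total DNA recovered and the share consumed by the partial digests described in the following paragraph can be worked out directly. The sketch below assumes the concentration is 552 µg/ml and that the digests use three 100 µg aliquots at 4, 2 and 1 enzyme units per µg; the variable names are illustrative.

```python
# Yield of the chromosomal DNA preparation (assumed units: micrograms, milliliters).
dna_volume_ml = 5.0
dna_conc_ug_per_ml = 552.0
total_dna_ug = dna_volume_ml * dna_conc_ug_per_ml

# Partial digests: three aliquots, each treated at a different enzyme ratio.
aliquot_ug = 100.0
units_per_ug = [4.0, 2.0, 1.0]
units_needed = [aliquot_ug * ratio for ratio in units_per_ug]

dna_used_ug = aliquot_ug * len(units_per_ug)
fraction_used = dna_used_ug / total_dna_ug

print(f"total DNA: {total_dna_ug:.0f} ug")
print(f"enzyme units per digest: {units_needed}")
print(f"fraction of preparation used: {fraction_used:.1%}")
```

On these figures the preparation holds about 2.76 mg of DNA, so the three digests draw on only about a tenth of it.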

Partially-restricted DNA was prepared as follows.
Three 100 µg aliquot samples of CP4 DNA were treated for 1 hour
at 37°C with restriction endonuclease HindIII at rates of 4, 2 and 1
enzyme unit/µg DNA, respectively. The DNA samples were
pooled, made 0.25 mM with EDTA and extracted with an equal
volume of phenol:chloroform. Following the addition of sodium
acetate and ethanol, the DNA was precipitated with two volumes of
ethanol and pelleted by centrifugation (12000 g; 10 minutes). The
dried DNA pellet was resuspended in 500 µl TE and layered on a
10-40% sucrose gradient (in 5% increments of 5.5 ml each) in 0.5 M
NaCl, 50 mM Tris pH 8.0, 5 mM EDTA. Following centrifugation
for 20 hours at 26,000 rpm in a SW28 rotor, the tubes were
punctured and ~1.5 ml fractions collected. Samples (20 µl) of each
second fraction were run on 0.7% agarose gel and the size of the
DNA determined by comparison with linearized lambda DNA and
HindIII-digested lambda DNA standards. Fractions containing
DNA of 25-35 kb fragments were pooled, desalted on AMICON10
columns (7000 rpm; 20°C; 45 minutes) and concentrated by
precipitation. This procedure yielded 15 µg of CP4 DNA of the
required size.

A cosmid bank was constructed using the vector
pMON17020. This vector, a map of which is presented in Figure 2,
is based on the pBR327 replicon and contains the
spectinomycin/streptomycin (Spr;spc) resistance gene from Tn7
(Fling et al., 1985), the chloramphenicol resistance gene (Cmr;cat)
from Tn9 (Alton et al., 1979), the gene10 promoter region from
phage T7 (Dunn et al., 1983), and the 1.6 kb BglII phage lambda
cos fragment from pHC79 (Hohn and Collins, 1980). A number of
cloning sites are located downstream of the cat gene. Since the
predominant block to the expression of genes from other microbial
sources in E. coli appears to be at the level of transcription, the use
of the T7 promoter and supplying the T7 polymerase in trans from
the pGP1-2 plasmid (Tabor and Richardson, 1985) enables the
expression of large DNA segments of foreign DNA, even those
containing RNA polymerase transcription termination sequences.
The expression of the spc gene is impaired by transcription from
the T7 promoter such that only Cmr can be selected in strains
containing pGP1-2. The use of antibiotic resistances such as Cm
resistance which do not employ a membrane component is
preferred due to the observation that high level expression of
resistance genes that involve a membrane component, i.e.
β-lactamase and Amp resistance, give rise to a glyphosate tolerant
phenotype. Presumably, this is due to the exclusion of glyphosate
from the cell by the membrane localized resistance protein. It is
also preferred that the selectable marker be oriented in the same
direction as the T7 promoter.

The vector was then cut with HindIII and treated
with calf alkaline phosphatase (CAP) in preparation for cloning.
Vector and target sequences were ligated by combining the
following:

Vector DNA (HindIII/CAP)                            3 µg
Size fractionated CP4 HindIII fragments             1.5 µg
10X ligation buffer                                 2.2 µl
T4 DNA ligase (New England Biolabs) (400 U/µl)      1.0 µl

and adding H2O to 22.0 µl. This mixture was incubated for 18
hours at 16°C. 10X ligation buffer is 250 mM Tris-HCl, pH 8.0; 100
mM MgCl2; 100 mM dithiothreitol; 2 mM spermidine. The ligated
DNA (5 µl) was packaged into lambda phage particles (Stratagene;
Gigapack Gold) using the manufacturer's procedure.
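
The working concentrations in the ligation reaction follow from diluting the 10X buffer tenfold into the 22.0 µl final volume. The sketch below assumes the buffer is used at 1X, which implies a buffer volume of one tenth of the reaction; the variable names are illustrative, and the enzyme total simply multiplies the stated stock concentration by the stated volume.

```python
FINAL_VOLUME_UL = 22.0
BUFFER_FOLD = 10.0  # the stock is described as 10X

# Composition of the 10X ligation buffer, in millimolar.
ten_x_buffer_mm = {
    "Tris-HCl": 250.0,
    "MgCl2": 100.0,
    "dithiothreitol": 100.0,
    "spermidine": 2.0,
}

# Volume of 10X stock needed to reach 1X in the final reaction (assumed 1X use).
buffer_volume_ul = FINAL_VOLUME_UL / BUFFER_FOLD

# Working (1X) concentrations in the reaction.
working_mm = {name: conc / BUFFER_FOLD for name, conc in ten_x_buffer_mm.items()}

# Total ligase activity: 400 U/ul stock, 1.0 ul added.
ligase_units = 400.0 * 1.0

print(f"10X buffer volume: {buffer_volume_ul:.1f} ul")
for name, conc in working_mm.items():
    print(f"{name}: {conc:g} mM")
print(f"T4 DNA ligase: {ligase_units:.0f} U")
```

At 1X the reaction would contain 25 mM Tris-HCl, 10 mM MgCl2, 10 mM dithiothreitol and 0.2 mM spermidine, with 400 units of ligase.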

A sample (200 µl) of E. coli HB101 (Boyer and
Roulland-Dussoix, 1973) containing the T7 polymerase expression
plasmid pGP1-2 (Tabor and Richardson, 1985) and grown
overnight in L-Broth (with maltose at 0.2% and kanamycin at 50
µg/ml) was infected with 50 µl of the packaged DNA.
Transformants were selected at 30°C on M9 (Miller, 1972) agar
containing kanamycin (50 µg/ml), chloramphenicol (25 µg/ml),
L-proline (50 µg/ml), L-leucine (50 µg/ml) and B1 (5 µg/ml), and
with glyphosate at 3.0 mM. Aliquot samples were also plated on
the same media lacking glyphosate to titer the packaged cosmids.
Cosmid transformants were isolated on this latter medium at a
rate of ~5 x 10^5 per µg CP4 HindIII DNA after 3 days at 30°C.
Colonies arose on the glyphosate agar from day 3 until day 15 with
a final rate of ~1 per 200 cosmids. DNA was prepared from 14
glyphosate tolerant clones and, following verification of this
phenotype, was transformed into E. coli GB100/pGP1-2 (E. coli
GB100 is an aroA derivative of MM294 [Talmadge and Gilbert,
1980]) and tested for complementation for growth in the absence of
added aromatic amino acids and aminobenzoic acids. Other aroA
strains such as SR481 (Bachman et al., 1980; Padgette et al., 1987)
could be used and would be suitable for this experiment. The use
of GB100 is merely exemplary and should not be viewed in a
limiting sense. This aroA strain usually requires that growth
media be supplemented with L-phenylalanine, L-tyrosine and
L-tryptophan each at 100 µg/ml and with para-hydroxybenzoic
acid, 2,3-dihydroxybenzoic acid and para-aminobenzoic acid each
at 5 µg/ml for growth in minimal media. Of the fourteen cosmids
tested only one showed complementation of the aroA- phenotype.
Transformants of this cosmid, pMON17076, showed weak but
uniform growth on the unsupplemented minimal media after 10
days.

The proteins encoded by the cosmids were determined
in vivo using a T7 expression system (Tabor and Richardson,
1985). Cultures of E. coli containing pGP1-2 (Tabor and
Richardson, 1985) and test and control cosmids were grown at
30°C in L-broth (2 ml) with chloramphenicol and kanamycin (25
and 50 µg/ml, respectively) to a Klett reading of ~50. An aliquot
was removed and the cells collected by centrifugation, washed
with M9 salts (Miller, 1972) and resuspended in 1 ml M9 medium
containing glucose at 0.2%, thiamine at 20 µg/ml and containing
the 18 amino acids at 0.01% (minus cysteine and methionine).
Following incubation at 30°C for 90 minutes, the cultures were
transferred to a 42°C water bath and held there for 15 minutes.
Rifampicin (Sigma) was added to 200 µg/ml and the cultures held
at 42°C for 10 additional minutes and then transferred to 30°C for
20 minutes. Samples were pulsed with 10 µCi of 35S-methionine
for 5 minutes at 30°C. The cells were collected by centrifugation
and suspended in 60-120 µl cracking buffer (60 mM Tris-HCl, pH 6.8,
1% SDS, 1% 2-mercaptoethanol, 10% glycerol, 0.01% bromophenol
blue). Aliquot samples were electrophoresed on 12.5% SDS-PAGE
and following soaking for 60 minutes in 10 volumes of Acetic
Acid-Methanol-water (10:30:60), the gel was soaked in
ENLIGHTNING (TM) (DUPONT) following manufacturer's
directions, dried, and exposed at -70°C to X-Ray film. Proteins of
about 45 kd in size, labeled with 35S-methionine, were detected in
a number of the cosmids, including pMON17076.

Purification of EPSPS from Agrobacterium sp. Strain CP4

All protein purification procedures were carried out
at 3-5°C. EPSPS enzyme assays were performed using either the
phosphate release or radioactive HPLC method, as previously
described in Padgette et al. 1987, using 1 mM phosphoenol
pyruvate (PEP, Boehringer) and 2 mM shikimate-3-phosphate
(S3P) substrate concentrations. For radioactive HPLC assays,
14C-PEP (Amersham) was utilized. S3P was synthesized as
previously described in Wibbenmeyer et al. 1988. N-terminal
amino acid sequencing was performed by loading samples onto a
Polybrene precycled filter in aliquots while drying. Automated
Edman degradation chemistry was used to determine the
N-terminal protein sequence, using an Applied Biosystems Model
470A gas phase sequencer (Hunkapiller et al. 1983) with an
Applied Biosystems 120A PTH analyzer.

Five 10-litre fermentations were carried out on a
spontaneous "smooth" isolate of strain CP4 that displayed less
clumping when grown in liquid culture. This reduced clumping
and smooth colony morphology may be due to reduced
polysaccharide production by this isolate. In the following section
dealing with the purification of the EPSPS enzyme, CP4 refers to
the "smooth" isolate - CP4-S1. The cells from the three batches
showing the highest specific activities were pooled. Cell paste of
Agrobacterium sp. CP4 (300 g) was washed twice with 0.5 L of 0.9%
saline and collected by centrifugation (30 minutes, 8000 rpm in a
GS3 Sorvall rotor). The cell pellet was suspended in 0.9 L
extraction buffer (100 mM TrisCl, 1 mM EDTA, 1 mM BAM
(Benzamidine), 5 mM DTT, 10% glycerol, pH 7.5) and lysed by 2
passes through a Manton Gaulin cell. The resulting solution was
centrifuged (30 minutes, 8000 rpm) and the supernatant was
treated with 0.21 L of 1.5% protamine sulfate (in 100 mM TrisCl,
pH 7.5, 0.2% w/v final protamine sulfate concentration). After
stirring for 1 hour, the mixture was centrifuged (50 minutes, 8000
rpm) and the resulting supernatant treated with solid ammonium
sulfate to 40% saturation and stirred for 1 hour. After
centrifugation (50 minutes, 8000 rpm), the resulting supernatant
was treated with solid ammonium sulfate to 70% saturation,
stirred for 50 minutes, and the insoluble protein was collected by
centrifugation (1 hour, 8000 rpm). This 40-70% ammonium sulfate
fraction was then dissolved in extraction buffer to give a final
volume of 0.2 L, and dialyzed twice (Spectrum 10,000 MW cutoff
dialysis tubing) against 2 L of extraction buffer for a total of 12
hours.

To the resulting dialyzed 40-70% ammonium sulfate
fraction (0.29 L) was added solid ammonium sulfate to give a final
concentration of 1 M. This material was loaded (2 ml/min) onto a
column (5 cm x 15 cm, 295 ml) packed with phenyl Sepharose
CL-4B (Pharmacia) resin equilibrated with extraction buffer
containing 1 M ammonium sulfate, and washed with the same
buffer (1.5 L, 2 ml/min). EPSPS was eluted with a linear gradient
of extraction buffer going from 1 M to 0.00 M ammonium sulfate
(total volume of 1.5 L, 2 ml/min). Fractions were collected (20 ml)
and assayed for EPSPS activity by the phosphate release assay.
The fractions with the highest EPSPS activity (fractions 36-50) were
pooled and dialyzed against 3 x 2 L (18 hours) of 10 mM TrisCl, 25
mM KCl, 1 mM EDTA, 5 mM DTT, 10% glycerol, pH 7.8.

The dialyzed EPSPS extract (350 ml) was loaded (5
ml/min) onto a column (2.4 cm x 30 cm, 136 ml) packed with
Q-Sepharose Fast Flow (Pharmacia) resin equilibrated with 10
mM TrisCl, 25 mM KCl, 5 mM DTT, 10% glycerol, pH 7.8 (Q
Sepharose buffer), and washed with 1 L of the same buffer. EPSPS
was eluted with a linear gradient of Q Sepharose buffer going from
0.025 M to 0.40 M KCl (total volume of 1.4 L, 5 ml/min). Fractions
were collected (15 ml) and assayed for EPSPS activity by the
phosphate release assay. The fractions with the highest EPSPS
activity (fractions 47-60) were pooled and the protein was
precipitated by adding solid ammonium sulfate to 80% saturation
and stirring for 1 hour. The precipitated protein was collected by
centrifugation (20 minutes, 12000 rpm in a GSA Sorvall rotor),
dissolved in Q Sepharose buffer (total volume of 14 ml), and
dialyzed against the same buffer (2 x 1 L, 18 hours).
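
The bed volumes quoted for the two preparative columns can be
cross-checked from their dimensions by treating each bed as a
cylinder. The following Python sketch is illustrative only: the
function name is hypothetical, and reading the first dimension as
the internal diameter is an assumption not stated in the text.

```python
import math

def bed_volume_ml(diameter_cm: float, height_cm: float) -> float:
    """Volume of a cylindrical column bed in ml (1 cm^3 = 1 ml)."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * height_cm

# Dimensions as given in the text; the quoted bed volumes are
# 295 ml (phenyl Sepharose) and 136 ml (Q-Sepharose).
columns = {
    "phenyl Sepharose CL-4B": (5.0, 15.0),
    "Q-Sepharose Fast Flow": (2.4, 30.0),
}

for name, (diameter, height) in columns.items():
    print(f"{name}: {bed_volume_ml(diameter, height):.1f} ml")
```

Under that assumption both computed volumes round to the figures
given in the text.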

The resulting dialyzed partially purified EPSPS
extract (19 ml) was loaded (1.7 ml/min) onto a Mono Q 10/10
column (Pharmacia) equilibrated with Q Sepharose buffer, and
washed with the same buffer (35 ml). EPSPS was eluted with a
linear gradient of 0.025 M to 0.35 M KCl (total volume of 119 ml, 1.7
ml/min). Fractions were collected (1.7 ml) and assayed for EPSPS
activity by the phosphate release assay. The fractions with the
highest EPSPS activity (fractions 30-37) were pooled (6 ml).

The Mono Q pool was made 1 M in ammonium sulfate
by the addition of solid ammonium sulfate and 2 ml aliquots were
chromatographed on a Phenyl Superose 5/5 column (Pharmacia)
equilibrated with 100 mM TrisCl, 5 mM DTT, 1 M ammonium
sulfate, 10% glycerol, pH 7.5 (Phenyl Superose buffer). Samples
were loaded (1 ml/min), washed with Phenyl Superose buffer (10
ml), and eluted with a linear gradient of Phenyl Superose buffer
going from 1 M to 0.00 M ammonium sulfate (total volume of 60
ml, 1 ml/min). Fractions were collected (1 ml) and assayed for
EPSPS activity by the phosphate release assay. The fractions from
each run with the highest EPSPS activity (fractions ~36-40) were
pooled together (10 ml, 2.5 mg protein). For N-terminal amino
acid sequence determination, a portion of one fraction (#39 from
run 1) was dialyzed against 50 mM NaHCO3 (2 x 1 L). The
resulting pure EPSPS sample (0.9 ml, 77 µg protein) was found to
exhibit a single N-terminal amino acid sequence of:
XH(G)ASSRPATARKSS(G)LX(G)(T)V(R)IPG(D)(K)(M) (SEQ ID NO:18).
In this and all amino acid sequences to follow, the
standard single letter nomenclature is used. All peptide
structures represented in the following description are shown in
conventional format wherein the amino group at the N-terminus
appears to the left and the carboxyl group at the C-terminus at the
right. Likewise, amino acid nomenclature for the naturally
occurring amino acids found in protein is as follows: alanine
(Ala;A), asparagine (Asn;N), aspartic acid (Asp;D), arginine
(Arg;R), cysteine (Cys;C), glutamic acid (Glu;E), glutamine
(Gln;Q), glycine (Gly;G), histidine (His;H), isoleucine (Ile;I),
leucine (Leu;L), lysine (Lys;K), methionine (Met;M),
phenylalanine (Phe;F), proline (Pro;P), serine (Ser;S), threonine
(Thr;T), tryptophan (Trp;W), tyrosine (Tyr;Y), and valine (Val;V).
An "X" is used when the amino acid residue is unknown and
parentheses designate that an unambiguous assignment is not
possible and the amino acid designation within the parentheses is
the most probable estimate based on known information.
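
The notation just described (plain letters for firm assignments,
parentheses for tentative ones, "X" for unknown residues) can be
separated into its parts mechanically. The Python sketch below is
one possible way to do so; the function and variable names are
hypothetical, and SEQ ID NO:18 is read on the assumption that each
tentative residue sits in its own pair of parentheses.

```python
import re

# One token is either a parenthesized letter or a bare letter.
TOKEN = re.compile(r"\(([A-Z])\)|([A-Z])")

def parse_calls(seq: str):
    """Split a sequencer read into (residue, status) pairs.

    status is 'unknown' for X, 'tentative' for a parenthesized
    residue, and 'certain' otherwise.
    """
    calls = []
    for match in TOKEN.finditer(seq):
        tentative, plain = match.groups()
        residue = tentative or plain
        if residue == "X":
            status = "unknown"
        elif tentative:
            status = "tentative"
        else:
            status = "certain"
        calls.append((residue, status))
    return calls

# SEQ ID NO:18, one tentative residue per pair of parentheses (assumed).
read = "XH(G)ASSRPATARKSS(G)LX(G)(T)V(R)IPG(D)(K)(M)"
calls = parse_calls(read)
plain = "".join(residue for residue, _ in calls)
n_tentative = sum(1 for _, status in calls if status == "tentative")
n_unknown = sum(1 for _, status in calls if status == "unknown")
print(plain, len(calls), n_tentative, n_unknown)
```

Keeping the status alongside each residue makes it easy to exclude
tentative or unknown positions when a read is later compared with
a translated DNA sequence.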

The remaining Phenyl Superose EPSPS pool was
dialyzed against 50 mM TrisCl, 2 mM DTT, 10 mM KCl, 10%
glycerol, pH 7.5 (2 x 1 L). An aliquot (0.55 ml, 0.61 mg protein) was
loaded (1 ml/min) onto a Mono Q 5/5 column (Pharmacia)
equilibrated with Q Sepharose buffer, washed with the same buffer
(5 ml), and eluted with a linear gradient of Q Sepharose buffer
going from 0-0.14 M in 10 minutes, then holding at 0.14 M KCl
(1 ml/min). Fractions were collected (1 ml) and assayed for EPSPS
activity by the phosphate release assay and were subjected to
SDS-PAGE (10-15%, Phast System, Pharmacia, with silver
staining) to determine protein purity. Fractions exhibiting a
single band of protein by SDS-PAGE (22-25, 222 µg) were pooled
and dialyzed against 100 mM ammonium bicarbonate, pH 8.1 (2 x
1 L, 9 hours).



Trypsinolysis and peptide sequencing of Agrobacterium sp. strain
CP4 EPSPS

To the resulting pure Agrobacterium sp. strain CP4
EPSPS (111 µg) was added 3 µg of trypsin (Calbiochem), and the
trypsinolysis reaction was allowed to proceed for 16 hours at 37°C.
The tryptic digest was then chromatographed (1 ml/min) on a C18
reverse phase HPLC column (Vydac) as previously described in
Padgette et al. 1988 for E. coli EPSPS. For all peptide purifications,
0.1% trifluoroacetic acid (TFA, Pierce) was designated buffer
"RP-A" and 0.1% TFA in acetonitrile was buffer "RP-B". The
gradient used for elution of the trypsinized Agrobacterium sp. CP4
EPSPS was: 0-8 minutes, 0% RP-B; 8-28 minutes, 0-15% RP-B;
28-40 minutes, 15-21% RP-B; 40-68 minutes, 21-49% RP-B; 68-72
minutes, 49-75% RP-B; 72-74 minutes, 75-100% RP-B. Fractions
were collected (1 ml) and, based on the elution profile at 210 nm, at
least 70 distinct peptides were produced from the trypsinized
EPSPS. Fractions 40-70 were evaporated to dryness and
redissolved in 150 µl each of 10% acetonitrile, 0.1% trifluoroacetic
acid.

The fraction 61 peptide was further purified on the
C18 column by the gradient: 0-5 minutes, 0% RP-B; 5-10 minutes,
0-38% RP-B; 10-30 minutes, 38-45% RP-B. Fractions were collected
based on the UV signal at 210 nm. A large peptide peak in fraction
24 eluted at 42% RP-B and was dried down, resuspended as
described above, and rechromatographed on the C18 column with
the gradient: 0-5 minutes, 0% RP-B; 5-12 min, 0-38% RP-B; 12-15
min, 38-39% RP-B; 15-18 minutes, 39% RP-B; 18-20 minutes,
39-41% RP-B; 20-24 minutes, 41% RP-B; 24-28 minutes, 42% RP-B.
The peptide in fraction 25, eluting at 41% RP-B and designated
peptide 61-24-25, was subjected to N-terminal amino acid
sequencing, and the following sequence was determined:

APSM(I)(D)EYPILAV (SEQ ID NO:19).

The CP4 EPSPS fraction 53 tryptic peptide was further purified by
C18 HPLC by the gradient 0% B (5 minutes), 0-30% B (5-17
minutes), 30-40% B (17-37 minutes). The peptide in fraction 28,
eluting at 34% B and designated peptide 53-28, was subjected to
N-terminal amino acid sequencing, and the following sequence
was determined:

ITGLLEGEDVINTGK (SEQ ID NO:20).
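
The stepwise HPLC gradients above are piecewise linear in time, so
the programmed solvent composition at any moment, or the moment at
which a given composition is reached, follows by interpolation. The
Python sketch below illustrates this for the first repurification
gradient of the fraction 61 peptide. The function names are
hypothetical, and the calculation gives the composition programmed
at the pump only; it ignores column and tubing dead volume, so it is
not expected to reproduce the fraction numbers reported in the text.

```python
# Each segment: (start_min, end_min, start_percent_B, end_percent_B).
# Values are the first repurification gradient for the fraction 61
# peptide as given in the text.
GRADIENT = [
    (0.0, 5.0, 0.0, 0.0),
    (5.0, 10.0, 0.0, 38.0),
    (10.0, 30.0, 38.0, 45.0),
]

def percent_b_at(time_min, gradient=GRADIENT):
    """Programmed %RP-B at a given time, by linear interpolation."""
    for t0, t1, b0, b1 in gradient:
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)
    raise ValueError("time is outside the programmed gradient")

def time_at_percent_b(percent_b, gradient=GRADIENT):
    """Earliest programmed time at which the gradient reaches percent_b."""
    for t0, t1, b0, b1 in gradient:
        if b0 == b1:
            # Isocratic hold: only matches its own composition.
            if percent_b == b0:
                return t0
            continue
        if min(b0, b1) <= percent_b <= max(b0, b1):
            return t0 + (percent_b - b0) * (t1 - t0) / (b1 - b0)
    raise ValueError("composition is never reached by this gradient")

print(round(time_at_percent_b(42.0), 1))  # programmed time for 42% RP-B
print(percent_b_at(7.5))                  # composition midway up the ramp
```

Representing the gradient as a list of segments keeps the sketch
reusable for the other gradients in this section by swapping in
their time and composition values.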



In order to verify the CP4 EPSPS cosmid clone, a
number of oligonucleotide probes were designed on the basis of the
sequences of two of the tryptic peptides from the CP4 enzyme
(Table III). The probe identified as MID was of very low degeneracy
and was used for initial screening. The probes identified as
EDV-C and EDV-T were based on the same amino acid sequence
and differ in one position (underlined in Table III below) and were
used as confirmatory probes, with a positive to be expected only
from one of these two probes. In the oligonucleotides below,
alternate acceptable nucleotides at a particular position are
designated by a "/" such as A/C/T.

Table III

PEPTIDE 61-24-25   APSM(I)(D)EYPILAV (SEQ ID NO:19)
Probe MID; 17-mer; mixed probe; 24-fold degenerate
ATGATA/C/TGAC/TGAG/ATAC/TCC (SEQ ID NO:21)

PEPTIDE 53-28      ITGLLEGEDVINTGK (SEQ ID NO:20)
Probe EDV-C; 17-mer; mixed probe; 48-fold degenerate
GAA/GGAC/TGTA/C/G/TATA/C/TAACAC (SEQ ID NO:22)
Probe EDV-T; 17-mer; mixed probe; 48-fold degenerate
GAA/GGAC/TGTA/C/G/TATA/C/TAATAC (SEQ ID NO:23)
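
The length and degeneracy stated for each mixed probe can be
recomputed from the slash notation: every position contributes the
number of bases allowed there, and the degeneracy is the product over
all positions. The Python sketch below is one way to do this. The
function names are hypothetical, and probe EDV-T is taken to end in
AATAC, which is an assumed reading of the printed sequence.

```python
import re
from math import prod

# A position is one base, optionally followed by /base alternatives.
POSITION = re.compile(r"[ACGT](?:/[ACGT])*")

def parse_mixed_probe(probe: str):
    """Return one set of allowed bases per probe position."""
    return [set(token.split("/")) for token in POSITION.findall(probe)]

def degeneracy(probe: str) -> int:
    """Number of distinct oligonucleotides in the mixed probe."""
    return prod(len(bases) for bases in parse_mixed_probe(probe))

PROBES = {
    "MID": "ATGATA/C/TGAC/TGAG/ATAC/TCC",
    "EDV-C": "GAA/GGAC/TGTA/C/G/TATA/C/TAACAC",
    "EDV-T": "GAA/GGAC/TGTA/C/G/TATA/C/TAATAC",  # assumed reading
}

for name, probe in PROBES.items():
    print(name, len(parse_mixed_probe(probe)), degeneracy(probe))
```

With these inputs each probe parses to 17 positions, and the
products agree with the 24-fold and 48-fold figures given in the
table.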

The probes were labeled using gamma-32P-ATP and
polynucleotide kinase. DNA from fourteen of the cosmids
described above was restricted with EcoRI, transferred to
membrane and probed with the oligonucleotide probes. The
conditions used were as follows: prehybridization was carried out
in 6X SSC, 10X Denhardt's for 2-18 hour periods at 60°C, and
hybridization was for 48-72 hours in 6X SSC, 10X Denhardt's, 100
µg/ml tRNA at 10°C below the Td for the probe. The Td of the probe
was approximated by the formula 2°C x (A+T) + 4°C x (G+C). The
filters were then washed three times with 6X SSC for ten minutes
each at room temperature, dried and autoradiographed. Using
the MID probe, an ~9.9 kb fragment in the pMON17076 cosmid
gave the only positive signal. This cosmid DNA was then probed
with the EDV-C (SEQ ID NO:22) and EDV-T (SEQ ID NO:23) probes
separately and again this ~9.9 kb band gave a signal and only with
the EDV-T probe.
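
Because the probes are mixtures, the 2°C x (A+T) + 4°C x (G+C)
approximation gives a range of values rather than a single one. The
Python sketch below computes the lowest and highest value over all
members of a mixed probe. It is illustrative only: the function name
is hypothetical, probe EDV-T is taken to end in AATAC (an assumed
reading), and the text does not say which member of each mixture was
used to set the hybridization temperature.

```python
import re

# A position is one base, optionally followed by /base alternatives.
POSITION = re.compile(r"[ACGT](?:/[ACGT])*")

def td_range_celsius(probe: str):
    """(min, max) of 2*(A+T) + 4*(G+C) over a mixed probe's members."""
    low = high = 0
    for token in POSITION.findall(probe):
        # A or T scores 2 degrees, G or C scores 4 degrees.
        scores = [2 if base in "AT" else 4 for base in token.split("/")]
        low += min(scores)
        high += max(scores)
    return low, high

PROBES = {
    "MID": "ATGATA/C/TGAC/TGAG/ATAC/TCC",
    "EDV-C": "GAA/GGAC/TGTA/C/G/TATA/C/TAACAC",
    "EDV-T": "GAA/GGAC/TGTA/C/G/TATA/C/TAATAC",  # assumed reading
}

for name, probe in PROBES.items():
    low, high = td_range_celsius(probe)
    print(f"{name}: {low}-{high} C by the 2/4 rule")
```

Since each position varies independently, summing the per-position
minima and maxima gives the same bounds as enumerating every member
of the mixture.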

The combined data on the glyphosate tolerant
phenotype, the complementation of the E. coli aroA- phenotype, the
expression of a ~45 Kd protein, and the hybridization to two probes
derived from the CP4 EPSPS amino acid sequence strongly
suggested that the pMON17076 cosmid contained the EPSPS gene.

Localization and subcloning of the CP4 EPSPS gene

The CP4 EPSPS gene was further localized as follows:
a number of additional Southern analyses were carried out on
different restriction digests of pMON17076 using the MID (SEQ ID
NO:21) and EDV-T (SEQ ID NO:23) probes separately. Based on
these analyses and on subsequent detailed restriction mapping of
the pBlueScript (Stratagene) subclones of the ~9.9 kb fragment
from pMON17076, a 3.8 kb EcoRI-SalI fragment was identified to
which both probes hybridized. This analysis also showed that MID
(SEQ ID NO:21) and EDV-T (SEQ ID NO:23) probes hybridized to
different sides of BamHI, ClaI, and SacII sites. This 3.8 kb
fragment was cloned in both orientations in pBlueScript to form
pMON17081 and pMON17082. The phenotypes imparted to E. coli
by these clones were then determined. Glyphosate tolerance was
determined following transformation into E. coli MM294
containing pGP1-2 (pBlueScript also contains a T7 promoter) on
M9 agar media containing glyphosate at 3 mM. Both pMON17081
and pMON17082 showed glyphosate tolerant colonies at three days
at 30°C at about half the size of the controls on the same media
lacking glyphosate. This result suggested that the 3.8 kb fragment
contained an intact EPSPS gene. The apparent lack of
orientation-dependence of this phenotype could be explained by the
presence of the T7 promoter at one side of the cloning sites and the
lac promoter at the other. The aroA phenotype was determined in
transformants of E. coli GB100 on M9 agar media lacking
aromatic supplements. In this experiment, carried out with and
without the Plac inducer IPTG, pMON17082 showed much greater
growth than pMON17081, suggesting that the EPSPS gene was
expressed from the SalI site towards the EcoRI site.

Nucleotide sequencing was begun from a number of
restriction site ends, including the BamHI site discussed above.
Sequences encoding protein sequences that closely matched the
N-terminus protein sequence and that for the tryptic fragment
53-28 (SEQ ID NO:20) (the basis of the EDV-T probe) (SEQ ID
NO:23) were localized to the SalI side of this BamHI site. These
data provided conclusive evidence for the cloning of the CP4
EPSPS gene and for the direction of transcription of this gene.
These data coupled with the restriction mapping data also
indicated that the complete gene was located on an ~2.3 kb XhoI
fragment and this fragment was subcloned into pBlueScript. The
nucleotide sequence of almost 2 kb of this fragment was
determined by a combination of sequencing from cloned
restriction fragments and by the use of specific primers to extend
the sequence. The nucleotide sequence of the CP4 EPSPS gene
and flanking regions is shown in Figure 3 (SEQ ID NO:2). The
sequence corresponding to peptide 61-24-25 (SEQ ID NO:19) was
also located. The sequence was determined using both the
Sequenase kit from IBI (International Biotechnologies Inc.) and
the T7 sequencing/Deaza Kit from Pharmacia.

That the cloned gene encoded the EPSPS activity
purified from the Agrobacterium sp. strain CP4 was verified in the
following manner: By a series of site directed mutageneses, BglII
and NcoI sites were placed at the N-terminus with the fMet
contained within the NcoI recognition sequence, the first internal
NcoI site was removed (the second internal NcoI site was removed
later), and a SacI site was placed after the stop codons. At a later
stage the internal NotI site was also removed by site-directed
mutagenesis. The following list includes the primers for the
site-directed mutagenesis (addition or removal of restriction sites)
of the CP4 EPSPS gene. Mutagenesis was carried out by the
procedures of Kunkel et al. (1987), essentially as described in
Sambrook et al. (1989).

PRIMER BgNc (addition of BglII and NcoI sites to N-terminus)
CGTGGATAGATCTAGGAAGACAACCATGGCTCACGGTC (SEQ ID NO:24)

PRIMER Sph2 (addition of SphI site to N-terminus)
GGATAGATTAAGGAAGACGCGCATGCTTCACGGTGCAAGCAGCC (SEQ ID NO:25)

PRIMER S1 (addition of SacI site immediately after stop codons)
GGCTGCCTGATGAGCTCCACAATCGCCATCGATGG (SEQ ID NO:26)

PRIMER N1 (removal of internal NotI recognition site)
CGTCGCTCGTCGTGCGTGGCCGCCCTGACGGC (SEQ ID NO:27)

PRIMER Nco1 (removal of first internal NcoI recognition site)
CGGGCAAGGCCATGCAGGCTATGGGCGCC (SEQ ID NO:28)

PRIMER Nco2 (removal of second internal NcoI recognition site)
CGGGCTGCCGCCTGACTATGGGCCTCGTCGG (SEQ ID NO:29)
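
Each primer's stated purpose can be checked against its sequence:
an "addition" primer should contain the recognition sequence of the
enzyme named, and a "removal" primer should not. The Python sketch
below performs that scan. It is illustrative only; the dictionary and
function names are hypothetical, the primer sequences are as
transcribed from the list above, and the check looks at the primers
alone, not at the gene sequence they anneal to.

```python
# Recognition sequences (all palindromic, so one strand suffices).
SITES = {
    "BglII": "AGATCT",
    "NcoI": "CCATGG",
    "SphI": "GCATGC",
    "SacI": "GAGCTC",
    "NotI": "GCGGCCGC",
}

PRIMERS = {
    "BgNc": "CGTGGATAGATCTAGGAAGACAACCATGGCTCACGGTC",
    "Sph2": "GGATAGATTAAGGAAGACGCGCATGCTTCACGGTGCAAGCAGCC",
    "S1": "GGCTGCCTGATGAGCTCCACAATCGCCATCGATGG",
    "N1": "CGTCGCTCGTCGTGCGTGGCCGCCCTGACGGC",
    "Nco1": "CGGGCAAGGCCATGCAGGCTATGGGCGCC",
    "Nco2": "CGGGCTGCCGCCTGACTATGGGCCTCGTCGG",
}

def sites_in(primer: str):
    """Names of the listed enzymes whose site occurs in the primer."""
    return sorted(name for name, site in SITES.items() if site in primer)

for name, primer in PRIMERS.items():
    found = sites_in(primer)
    print(f"{name}: {', '.join(found) if found else 'none'}")
```

On these sequences the three addition primers carry the sites they
are described as introducing, and the three removal primers carry
none of the listed sites.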

This CP4 EPSPS gene was then cloned as a
NcoI-BamHI N-terminal fragment plus a BamHI-SacI C-terminal
fragment into a PrecA-gene10L expression vector similar to those
described (Wong et al., 1988; Olins et al., 1988) to form pMON17101.
The Km for PEP and the Ki for glyphosate were determined for the
EPSPS activity in crude lysates of pMON17101/GB100
transformants following induction with nalidixic acid (Wong et
al., 1988) and found to be the same as that determined for the
purified and crude enzyme preparations from Agrobacterium sp.
strain CP4.

Characterization of the EPSPS gene from Achromobacter sp.
strain LBAA and from Pseudomonas sp. strain PG2982

A cosmid bank of partially HindIII-restricted LBAA
DNA was constructed in E. coli MM294 in the vector pHC79 (Hohn
and Collins, 1980). This bank was probed with a full length CP4
EPSPS gene probe by colony hybridization and positive clones were
identified at a rate of ~1 per 400 cosmids. The LBAA EPSPS gene
was further localized in these cosmids by Southern analysis. The
gene was located on an ~2.8 kb XhoI fragment and by a series of
sequencing steps, both from restriction fragment ends and by
using the oligonucleotide primers from the sequencing of the CP4
EPSPS gene, the nucleotide sequence of the LBAA EPSPS gene was
completed and is presented in Figure 4 (SEQ ID NO:4).

The EPSPS gene from PG2982 was also cloned. The
EPSPS protein was purified, essentially as described for the CP4
enzyme, with the following differences: Following the Sepharose
CL-4B column, the fractions with the highest EPSPS activity were
pooled and the protein precipitated by adding solid ammonium
sulfate to 85% saturation and stirring for 1 hour. The precipitated
protein was collected by centrifugation, resuspended in Q
Sepharose buffer and following dialysis against the same buffer
was loaded onto the column (as for the CP4 enzyme). After
purification on the Q Sepharose column, ~40 mg of protein in 100
mM Tris pH 7.8, 10% glycerol, 1 mM EDTA, 1 mM DTT, and 1 M
ammonium sulfate, was loaded onto a Phenyl Superose
(Pharmacia) column. The column was eluted at 1.0 ml/minute
with a 40 ml gradient from 1.0 M to 0.00 M ammonium sulfate in
the above buffer.

Approximately 1.0 mg of protein from the active
fractions of the Phenyl Superose 10/10 column was loaded onto a
Pharmacia Mono P 5/10 Chromatofocusing column with a flow
rate of 0.75 ml/minute. The starting buffer was 25 mM bis-Tris at
pH 6.3, and the column was eluted with 39 ml of Polybuffer 74, pH
4.0. Approximately 50 µg of the peak fraction from the
Chromatofocusing column was dialyzed into 25 mM ammonium
bicarbonate. This sample was then used to determine the
N-terminal amino acid sequence.

The N-terminal sequence obtained was:

XHSASPKPATARRSE (where X = an unidentified
residue) (SEQ ID NO:30). A number of degenerate oligonucleotide
probes were designed based on this sequence and used to probe a
library of PG2982 partial-HindIII DNA in the cosmid pHC79
(Hohn and Collins, 1980) by colony hybridization under
nonstringent conditions. Final washing conditions were 15
minutes with 1X SSC, 0.1% SDS at 55°C. One probe with the
sequence GCGGTBGCSGGYTTSGG (where B = C, G, or T; S = C
or G, and Y = C or T) (SEQ ID NO:31) identified a set of cosmid
clones.
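
The relationship between this degenerate probe and the N-terminal
sequence can be examined by expanding the ambiguity codes and
translating the result. The text does not state which strand the
probe represents; the Python sketch below tests one reading, namely
that the probe is complementary to a coding sequence for residues
PKPAT(A) of SEQ ID NO:30. All function and variable names are
hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def expand(probe: str):
    """All unambiguous sequences represented by a degenerate probe."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in probe))]

def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def translate(seq: str) -> str:
    """Translate complete codons from position 0; ignore leftovers."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

PROBE = "GCGGTBGCSGGYTTSGG"
members = expand(PROBE)
peptides = {translate(reverse_complement(m)) for m in members}
print(len(members), peptides)
```

Under this reading the probe is a 24-member mixture, and every
member's complement translates to the same five residues, with the
two remaining bases matching the start of an alanine codon.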

The cosmid set identified in this way was made up of
cosmids of diverse HindIII fragments. However, when this set
was probed with the CP4 EPSPS gene probe, a cosmid containing
the PG2982 EPSPS gene was identified (designated as cosmid 9C1
originally and later as pMON20107). By a series of restriction
mappings and Southern analysis this gene was localized to a ~2.8
kb XhoI fragment and the nucleotide sequence of this gene was
determined. This DNA sequence (SEQ ID NO:6) is shown in
Figure 5. There are no nucleotide differences between the EPSPS
gene sequences from LBAA (SEQ ID NO:4) and PG2982 (SEQ ID
NO:6). The kinetic parameters of the two enzymes agree within the
range of experimental error.

A gene from PG2982 that imparts glyphosate
tolerance in E. coli has been sequenced (Fitzgibbon, 1988;
Fitzgibbon and Braymer, 1990). The sequence of the PG2982 EPSPS
Class II gene shows no homology to the previously reported
sequence, suggesting that the glyphosate tolerant phenotype of the
previous work is not related to EPSPS.

Alternative Isolation Protocols for Other Class II EPSPS
Structural Genes

A number of Class II genes have been isolated and
described here. It is clear that the initial gene cloning, that of the
gene from CP4, was difficult due to the low degree of similarity
between the Class I and Class II enzymes and genes. The
identification of the other genes however was greatly facilitated by
the use of this first gene as a probe. In the cloning of the LBAA
EPSPS gene, the CP4 gene probe allowed the rapid identification of
cosmid clones and the localization of the intact gene to a small
restriction fragment and some of the CP4 sequencing primers
were also used to sequence the LBAA (and PG2982) EPSPS gene(s).
The CP4 gene probe was also used to confirm the PG2982 gene
clone. The high degree of similarity of the Class II EPSPS genes
may be used to identify and clone additional genes in much the
same way that Class I EPSPS gene probes have been used to clone
other Class I genes. An example of the latter was in the cloning of
the A. thaliana EPSPS gene using the P. hybrida gene as a probe
(Klee et al., 1987).

Glyphosate tolerant EPSPS activity has been reported
previously for EPSP synthases from a number of sources. These
enzymes have not been characterized to any extent in most cases.
The use of Class I and Class II EPSPS gene probes or antibody
probes provides a rapid means of initially screening for the nature
of the EPSPS and provides tools for the rapid cloning and
characterization of the genes for such enzymes.

Two of the three genes described were isolated from
bacteria that were isolated from a glyphosate treatment facility
(Strains CP4 and LBAA). The third (PG2982) was from a
bacterium that had been isolated from a culture collection strain.
This latter isolation suggests that exposure to glyphosate may not
be a prerequisite for the isolation of high glyphosate tolerant
EPSPS enzymes and that the screening of collections of bacteria
could yield additional isolates. It is possible to enrich for
glyphosate degrading or glyphosate resistant microbial
populations (Quinn et al., 1988; Talbot et al., 1984) in cases where it
was felt that enrichment for such microorganisms would enhance
the isolation frequency of Class II EPSPS microorganisms.
Additional bacteria containing Class II EPSPS genes have also been
identified. A bacterium called C12, isolated from the same
treatment column beads as CP4 (see above) but in a medium in
which glyphosate was supplied as both the carbon and phosphorus
source, was shown by Southern analysis to hybridize with a probe
consisting of the CP4 EPSPS coding sequence. This result, in
conjunction with that for strain LBAA, suggests that this
enrichment method facilitates the identification of Class II EPSPS
isolates. New bacterial isolates containing Class II EPSPS genes
have also been identified from environments other than
glyphosate waste treatment facilities. An inoculum was prepared
by extracting soil (from a recently harvested soybean field in
Jerseyville, Illinois) and a population of bacteria selected by
growth at 28°C in Dworkin-Foster medium containing glyphosate
at 10 mM as a source of carbon (and with cycloheximide at 100
µg/ml to prevent the growth of fungi). Upon plating on L-agar
media, five colony types were identified. Chromosomal DNA was
prepared from 2 ml L-broth cultures of these isolates and the
presence of a Class II EPSPS gene was probed using the CP4
EPSPS coding sequence probe by Southern analysis under
stringent hybridization and washing conditions. One of the soil
isolates, S2, was positive by this screen.

Relationships between different EPSPS genes

The deduced amino acid sequences of a number of
Class I and the Class II EPSPS enzymes were compared using the
Bestfit computer program provided in the UWGCG package
(Devereux et al. 1984). The degree of similarity and identity as
determined using this program is reported. The degree of
similarity/identity determined within Class I and Class II protein
sequences is remarkably high, for instance, comparing E. coli
with S. typhimurium (similarity/identity = 93%/88%) and even
comparing E. coli with a plant EPSPS (Petunia hybrida; 72%/55%).
These data are shown in Table IV. The comparison of sequences
between Class I and Class II, however, shows only a very low
degree of relatedness between the Classes (similarity/identity =
50-53%/23-30%). The display of the Bestfit analysis for the E. coli
(SEQ ID NO:8) and CP4 (SEQ ID NO:3) sequences shows the
positions of the conserved residues and is presented in Figure 6.
Previous analyses of EPSPS sequences had noted the high degree
of conservation of sequences of the enzymes and the near
invariance of sequences in two regions - the "20-35" and "95-107"
regions (Gasser et al., 1988; numbered according to the Petunia
EPSPS sequence) - and these regions are less conserved in the case
of CP4 and LBAA when compared to Class I bacterial and plant
EPSPS sequences (see Figure 6 for a comparison of the E. coli and
CP4 EPSPS sequences with the E. coli sequence appearing as the
top sequence in the Figure). The corresponding sequences in the
CP4 Class II EPSPS are:

PGDKSISHRSFMFGGL (SEQ ID NO:32) and LDFGNAATGCRLT
(SEQ ID NO:33).

20 

These comparisons show that the overall relatedness 
of Class I and Class n is EPSPS proteins is low and that sequences 
in putative conserved regions have also diverged considerably. 

In the CP4 EPSPS an alanine residue is present at the
"glycine101" position. The replacement of the conserved glycine
(from the "95-107" region) by an alanine results in an elevated Ki
for glyphosate and in an elevation in the Km for PEP in Class I
EPSPS. In the case of the CP4 EPSPS, which contains an alanine
at this position, the Km for PEP is in the low range, indicating that
the Class II enzymes differ in many aspects from the EPSPS
enzymes heretofore characterized.

Within the Class II isolates, the degree of
similarity/identity is as high as that noted within Class I
(Table IV). Figure 7 displays the Bestfit computer program
alignment of the CP4 (SEQ ID NO:3) and LBAA (SEQ ID NO:5)
EPSPS deduced amino acid sequences with the CP4 sequence
appearing as the top sequence in the Figure. The symbols used in
Figures 6 and 7 are the standard symbols used in the Bestfit
computer program to designate degrees of similarity and identity.
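
As an illustration only, the percent similarity and percent identity
figures discussed above can be thought of as counts over the aligned
positions of two sequences. The sketch below is not the Bestfit
program and does not reproduce its scoring matrix; the groups of
"similar" residues, the function name, and the choice to skip gapped
positions are all assumptions made for the example.

```python
# Minimal sketch: percent similarity/identity over a pairwise alignment.
# ASSUMPTIONS (not from the specification): the residue groups below,
# the gap character "-", and the exclusion of gapped columns.

SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")


def percent_similarity_identity(aligned_a, aligned_b, groups=SIMILAR_GROUPS):
    """Return (percent_similarity, percent_identity) for two aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Map each residue to the index of its (hypothetical) similarity group.
    group_of = {}
    for index, group in enumerate(groups):
        for residue in group:
            group_of[residue] = index

    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are not counted in this sketch
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in group_of and group_of.get(b) == group_of[a]:
            similar += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * similar / compared, 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned fragments, invented for the example.
    sim, ident = percent_similarity_identity("ACDEF-G", "ACEEY-G")
    print(f"similarity = {sim:.1f}%, identity = {ident:.1f}%")
```

Because similarity counts conservative substitutions as well as exact
matches, it is never lower than identity, which is the pattern seen
in every row of Table IV.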

Table IV. Comparison of relatedness of EPSPS protein sequences (1)

Comparison between Class I and Class II EPSPS protein sequences

                                     similarity   identity
E. coli vs. CP4                         52.8        26.3
E. coli vs. LBAA                        52.1        26.7
S. typhimurium vs. CP4                  51.8        25.8
B. pertussis vs. CP4                    52.8        27.3
S. cerevisiae vs. CP4                   53.5        29.9
P. hybrida vs. CP4                      50.2        23.4

Comparison between Class I EPSPS protein sequences

                                     similarity   identity
E. coli vs. S. typhimurium              93.0        88.3
P. hybrida vs. E. coli                  71.9        54.5

Comparison between Class II EPSPS protein sequences

                                     similarity   identity
Agrobacterium sp. strain CP4
vs. Achromobacter sp. strain LBAA       89.9        83.7

(1) The EPSPS sequences compared here were obtained from the
following references: E. coli, Rogers et al., 1983; S. typhimurium,
Stalker et al., 1985; Petunia hybrida, Shah et al., 1986; B. pertussis,
Maskell et al., 1988; and S. cerevisiae, Duncan et al., 1987.

One difference that may be noted between the deduced
amino acid sequences of the CP4 and LBAA EPSPS proteins is at
position 100 where an Alanine is found in the case of the CP4
enzyme and a Glycine is found in the case of the LBAA enzyme.
In the Class I EPSPS enzymes a Glycine is usually found in the
equivalent position, i.e., Glycine96 in E. coli and K. pneumoniae and
Glycine101 in Petunia. In the case of these three enzymes it has
been reported that converting that Glycine to an Alanine results in
an elevation of the appKi for glyphosate and a concomitant
elevation in the appKm for PEP (Kishore et al., 1986; Kishore and
Shah, 1988; Sost and Amrhein, 1990), which, as discussed above,
makes the enzyme less efficient especially under conditions of
lower PEP concentrations. The Glycine100 of the LBAA EPSPS
was converted to an Alanine and both the appKm for PEP and the
appKi for glyphosate were determined for the variant. The
Glycine100Alanine change was introduced by mutagenesis using
the following primer:

CGGCAATGCCGCCACCGGCGCGCGCC (SEQ ID NO:34)

and both the wild type and variant genes were expressed in E. coli
in a RecA promoter expression vector (pMON17201 and
pMON17264, respectively) and the appKm's and appKi's
determined in crude lysates. The data indicate that the
appKi(glyphosate) for the G100A variant is elevated about 16-fold
(Table V). This result is in agreement with the observation of the
importance of this G-A change in raising the appKi(glyphosate) in
the Class I EPSPS enzymes. However, in contrast to the results in
the Class I G-A variants, the appKm(PEP) in the Class II (LBAA)
G-A variant is unaltered. This provides yet another distinction
between the Class II and Class I EPSPS enzymes.

Table V

Lysate prepared from:              appKm(PEP)    appKi(glyphosate)
E. coli/pMON17201 (wild type)      5.3 µM@       28 µM*
E. coli/pMON17264 (G100A variant)  5.5 µM@       459 µM#

@ range of PEP: 2-40 µM
* range of glyphosate: 0-310 µM; # range of glyphosate: 0-5000 µM

The LBAA G100A variant, by virtue of its superior kinetic
properties, is capable of imparting improved glyphosate tolerance
in planta.
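
To make the kinetic argument concrete, the following sketch computes
the fold-change in appKi between the wild type and the G100A variant
and, under a simple competitive-inhibition model, the fraction of
activity retained in the presence of glyphosate. It is an illustration
only. The assumptions are that glyphosate behaves as a classical
competitive inhibitor with respect to PEP, that all Table V values
share the same concentration unit, and that the substrate and
inhibitor concentrations used in the usage lines are arbitrary values
chosen for the example rather than measurements.

```python
# Minimal sketch: fold-change in appKi and residual activity under a
# competitive-inhibition (Michaelis-Menten) model.
# ASSUMPTIONS: competitive inhibition versus PEP; consistent units;
# the PEP and glyphosate concentrations below are illustrative only.


def fold_change(variant_value, wild_type_value):
    """Ratio of a kinetic constant in the variant to that in the wild type."""
    if wild_type_value <= 0:
        raise ValueError("wild-type value must be positive")
    return variant_value / wild_type_value


def fractional_activity(substrate, inhibitor, km, ki):
    """v(inhibited) / v(uninhibited) for a competitive inhibitor.

    v0 = Vmax*S / (Km + S)
    vi = Vmax*S / (Km*(1 + I/Ki) + S)
    so vi/v0 = (Km + S) / (Km*(1 + I/Ki) + S).
    """
    if min(substrate, km, ki) <= 0 or inhibitor < 0:
        raise ValueError("concentrations and constants must be positive")
    return (km + substrate) / (km * (1.0 + inhibitor / ki) + substrate)


if __name__ == "__main__":
    # appKm(PEP) and appKi(glyphosate) as listed in Table V.
    wild_type = {"km": 5.3, "ki": 28.0}
    g100a = {"km": 5.5, "ki": 459.0}

    print("appKi fold-change:", round(fold_change(g100a["ki"], wild_type["ki"]), 1))

    # Hypothetical conditions (same units as the constants).
    pep, glyphosate = 10.0, 100.0
    for name, enzyme in (("wild type", wild_type), ("G100A", g100a)):
        frac = fractional_activity(pep, glyphosate, enzyme["km"], enzyme["ki"])
        print(f"{name}: {100 * frac:.0f}% of uninhibited activity")
```

Under this model an enzyme whose appKi rises while its appKm stays
essentially unchanged keeps more of its activity at any given
glyphosate concentration, which is the property attributed to the
LBAA G100A variant above.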

Modification and Resynthesis of the Agrobacterium sp. strain CP4
EPSPS Gene Sequence

The EPSPS gene from Agrobacterium sp. strain CP4
contains sequences that could be inimical to high expression of the
gene in plants. These sequences include potential polyadenylation
sites that are often A+T rich, a higher G+C% than that
frequently found in plant genes (63% versus ~50%), concentrated
stretches of G and C residues, and codons that are not used
frequently in plant genes. The high G+C% in the CP4 EPSPS gene
has a number of potential consequences including the following: a
higher usage of G or C than that found in plant genes in the third
position in codons, and the potential to form strong hair-pin
structures that may affect expression or stability of the RNA. The
reduction in the G+C content of the CP4 EPSPS gene, the
disruption of stretches of G's and C's, the elimination of potential
polyadenylation sequences, and improvements in the codon usage
to that used more frequently in plant genes, could result in higher
expression of the CP4 EPSPS gene in plants.

A synthetic CP4 gene was designed to change as
completely as possible those inimical sequences discussed above.
In summary, the gene sequence was redesigned to eliminate as
much as possible the following sequences or sequence features
(while avoiding the introduction of unnecessary restriction sites):
stretches of G's and C's of 5 or greater; and A+T rich regions
(predominantly) that could function as polyadenylation sites or
potential RNA destabilization regions. The sequence of this gene is
shown in Figure 8 (SEQ ID NO:9). This coding sequence was
expressed in E. coli from the RecA promoter and assayed for
EPSPS activity and compared with that from the native CP4 EPSPS
gene. The apparent Km for PEP for the native and synthetic genes
was 11.8 and 12.7, respectively, indicating that the enzyme
expressed from the synthetic gene was unaltered. The
N-terminus of the coding sequence was mutagenized to place an
SphI site at the ATG to permit the construction of the CTP2-CP4
synthetic fusion for chloroplast import. The following primer was
used to accomplish this mutagenesis:

GGACGGCTGCTTGCACCGTGAAGCATGCTTAAGCTTGGCGT
AATCATGG (SEQ ID NO:35).
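
The sequence features named in the redesign (overall G+C content,
G or C usage in the third codon position, stretches of G's and C's of
5 or greater, and A+T rich candidate polyadenylation motifs) lend
themselves to a simple scan. The sketch below is one way such a scan
might be written and is not the procedure used to design the synthetic
gene. The function names, the toy sequence, and the use of AATAAA as
an example motif are assumptions made for illustration.

```python
# Minimal sketch: scan a coding sequence for the features discussed above.
# ASSUMPTIONS: function names, the toy sequence, and the AATAAA motif are
# illustrative; they are not taken from the specification.
import re


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def third_position_gc_percent(cds):
    """G+C percentage at the third position of each codon (frame 0)."""
    return gc_percent(cds.upper()[2::3])


def gc_runs(seq, min_len=5):
    """Return (start, run) for every stretch of G/C at least min_len long."""
    pattern = re.compile(r"[GC]{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


def find_motif(seq, motif):
    """Return every start index of motif in seq, overlapping hits included."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    toy = "ATGGGCCCGAATAAATTT"  # invented sequence, not a CP4 fragment
    print("G+C %:", round(gc_percent(toy), 1))
    print("third-position G+C %:", round(third_position_gc_percent(toy), 1))
    print("G/C runs >= 5:", gc_runs(toy))
    print("AATAAA hits:", find_motif(toy, "AATAAA"))
```

Running such a scan on both the native and the redesigned coding
sequence would give a quick before-and-after view of the features the
redesign was meant to reduce.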






Expression of Chloroplast Directed CP4 EPSPS

The glyphosate target in plants, the
5-enolpyruvyl-shikimate-3-phosphate synthase (EPSPS) enzyme,
is located in the chloroplast. Many chloroplast-localized proteins,
including EPSPS, are expressed from nuclear genes as precursors
and are targeted to the chloroplast by a chloroplast transit peptide
(CTP) that is removed during the import steps. Examples of other
such chloroplast proteins include the small subunit (SSU) of
Ribulose-1,5-bisphosphate carboxylase (RUBISCO), Ferredoxin,
Ferredoxin oxidoreductase, the Light-harvesting-complex protein
I and protein II, and Thioredoxin F. It has been demonstrated in
vivo and in vitro that non-chloroplast proteins may be targeted to
the chloroplast by use of protein fusions with a CTP and that a CTP
sequence is sufficient to target a protein to the chloroplast.

A CTP-CP4 EPSPS fusion was constructed between
the Arabidopsis thaliana EPSPS CTP (Klee et al., 1987) and the CP4
EPSPS coding sequences. The Arabidopsis CTP was engineered by
site-directed mutagenesis to place a SphI restriction site at the
CTP processing site. This mutagenesis replaced the Glu-Lys at
this location with Cys-Met. The sequence of this CTP, designated
as CTP2 (SEQ ID NO:10), is shown in Figure 9. The N-terminus of
the CP4 EPSPS gene was modified to place a SphI site that spans
the Met codon. The second codon was converted to one for leucine
in this step also. This change had no apparent effect on the in vivo
activity of CP4 EPSPS in E. coli as judged by rate of
complementation of the aroA allele. This modified N-terminus
was then combined with the SacI C-terminus and cloned
downstream of the CTP2 sequences. The CTP2-CP4 EPSPS fusion
was cloned into pBlueScript KS(+). This vector may be transcribed
in vitro using the T7 polymerase and the RNA translated with
35S-Methionine to provide material that may be evaluated for
import into chloroplasts isolated from Lactuca sativa using the
methods described hereinafter (della-Cioppa et al., 1986, 1987).
This template was transcribed in vitro using T7 polymerase and
the 35S-methionine-labeled CTP2-CP4 EPSPS material was shown
to import into chloroplasts with an efficiency comparable to that
for the control Petunia EPSPS (control = 35S labeled PreEPSPS
[pMON6140; della-Cioppa et al., 1986]).

In another example the Arabidopsis EPSPS CTP,
designated as CTP3, was fused to the CP4 EPSPS through an
EcoRI site. The sequence of this CTP3 (SEQ ID NO:12) is shown in
Figure 10. An EcoRI site was introduced into the Arabidopsis
EPSPS mature region around amino acid 27, replacing the
sequence -Arg-Ala-Leu-Leu- with -Arg-Ile-Leu-Leu- in the
process. The primer of the following sequence was used to modify
the N-terminus of the CP4 EPSPS gene to add an EcoRI site to
effect the fusion to the CTP3:

GGAAGACGCCCAGAATTCACGGTGCAAGCAGCCGG
(SEQ ID NO:36) (the EcoRI site, GAATTC, is underlined).

This CTP3-CP4 EPSPS fusion was also cloned into the pBlueScript
vector and the T7 expressed fusion was found to also import into
chloroplasts with an efficiency comparable to that for the control
Petunia EPSPS (pMON6140).

A related series of CTPs, designated as CTP4 (SphI)
and CTP5 (EcoRI), based on the Petunia EPSPS CTP and gene
were also fused to the SphI- and EcoRI-modified CP4 EPSPS gene
sequences. The SphI site was added by site-directed mutagenesis
to place this restriction site (and change the amino acid sequence
to -Cys-Met-) at the chloroplast processing site. All of the CTP-CP4
EPSPS fusions were shown to import into chloroplasts with
approximately equal efficiency. The CTP4 (SEQ ID NO:14) and
CTP5 (SEQ ID NO:16) sequences are shown in Figures 11 and 12.

A CTP2-LBAA EPSPS fusion was also constructed
following the modification of the N-terminus of the LBAA EPSPS
gene by the addition of a SphI site. This fusion was also found to be
imported efficiently into chloroplasts.

By similar approaches, the CTP2-CP4 EPSPS and the
CTP4-CP4 EPSPS fusions have also been shown to import efficiently
into chloroplasts prepared from the leaf sheaths of corn. These
results indicate that these CTP-CP4 fusions could also provide
useful genes to impart glyphosate tolerance in monocot species.

Those skilled in the art will recognize that various
chimeric constructs can be made which utilize the functionality of
a particular CTP to import a Class II EPSPS enzyme into the plant
cell chloroplast. The chloroplast import of the Class II EPSPS can
be determined using the following assay.

Chloroplast Uptake Assay

Intact chloroplasts are isolated from lettuce (Lactuca
sativa, var. longifolia) by centrifugation in Percoll/ficoll gradients
as modified from Bartlett et al. (1982). The final pellet of intact
chloroplasts is suspended in 0.5 ml of sterile 330 mM sorbitol in 50
mM Hepes-KOH, pH 7.7, assayed for chlorophyll (Arnon, 1949),
and adjusted to the final chlorophyll concentration of 4 mg/ml
(using sorbitol/Hepes). The yield of intact chloroplasts from a
single head of lettuce is 3-6 mg chlorophyll.

A typical 300 µl uptake experiment contained 5 mM
ATP, 8.3 mM unlabeled methionine, 322 mM sorbitol, 58.3 mM
Hepes-KOH (pH 8.0), 50 µl reticulocyte lysate translation products,
and intact chloroplasts from L. sativa (200 µg chlorophyll). The
uptake mixture is gently rocked at room temperature (in 10 x 75
mm glass tubes) directly in front of a fiber optic illuminator set at
maximum light intensity (150 Watt bulb). Aliquot samples of the
uptake mix (about 50 µl) are removed at various times and
fractionated over 100 µl silicone-oil gradients (in 150 µl
polyethylene tubes) by centrifugation at 11,000 X g for 30 seconds.
Under these conditions, the intact chloroplasts form a pellet under
the silicone-oil layer and the incubation medium (containing the
reticulocyte lysate) floats on the surface. After centrifugation, the
silicone-oil gradients are immediately frozen in dry ice. The
chloroplast pellet is then resuspended in 50-100 µl of lysis buffer (10
mM Hepes-KOH pH 7.5, 1 mM PMSF, 1 mM benzamidine, 5 mM
ε-amino-n-caproic acid, and 30 µg/ml aprotinin) and centrifuged
at 15,000 X g for 20 minutes to pellet the thylakoid membranes.
The clear supernatant (stromal proteins) from this spin, and an
aliquot of the reticulocyte lysate incubation medium from each
uptake experiment, are mixed with an equal volume of 2X
SDS-PAGE sample buffer for electrophoresis (Laemmli, 1970).

SDS-PAGE is carried out according to Laemmli (1970)
in 3-17% (w/v) acrylamide slab gels (60 mm X 1.5 mm) with 3%
(w/v) acrylamide stacking gels (5 mm X 1.5 mm). The gel is fixed
for 20-30 min in a solution with 40% methanol and 10% acetic acid.
Then, the gel is soaked in EN3HANCE™ (DuPont) for 20-30
minutes, followed by drying the gel on a gel dryer. The gel is
imaged by autoradiography, using an intensifying screen and an
overnight exposure to determine whether the CP4 EPSPS is
imported into the isolated chloroplasts.

PLANT TRANSFORMATION

Plants which can be made glyphosate tolerant by
practice of the present invention include, but are not limited to,
soybean, cotton, corn, canola, oil seed rape, flax, sugarbeet,
sunflower, potato, tobacco, tomato, wheat, rice, alfalfa and lettuce
as well as various tree, nut and vine species.

A double-stranded DNA molecule of the present
invention ("chimeric gene") can be inserted into the genome of a
plant by any suitable method. Suitable plant transformation
vectors include those derived from a Ti plasmid of Agrobacterium
tumefaciens, as well as those disclosed, e.g., by Herrera-Estrella
(1983), Bevan (1984), Klee (1985) and EPO publication 120,516
(Schilperoort et al.). In addition to plant transformation vectors
derived from the Ti or root-inducing (Ri) plasmids of
Agrobacterium, alternative methods can be used to insert the DNA
constructs of this invention into plant cells. Such methods may
involve, for example, the use of liposomes, electroporation,
chemicals that increase free DNA uptake, free DNA delivery via
microprojectile bombardment, and transformation using viruses
or pollen.



Class II EPSPS plant transformation vectors

Class II EPSPS DNA sequences may be engineered
into vectors capable of transforming plants by using known
techniques. The following description is meant to be illustrative
and not to be read in a limiting sense. One of ordinary skill in the
art would know that other plasmids, vectors, markers, promoters,
etc. could be used with suitable results. The CTP2-CP4 EPSPS
fusion was cloned as a BglII-EcoRI fragment into the plant vector
pMON979 (described below) to form pMON17110, a map of which is
presented in Figure 13. In this vector the CP4 gene is expressed
from the enhanced CaMV35S promoter (E35S; Kay et al., 1987). A
FMV35S promoter construct (pMON17116) was completed in the
following way: The SalI-NotI and the NotI-BglII fragments from
pMON979 containing the Spc/AAC(3)-III/oriV and the
pBR322/Right Border/NOS 3'/CP4 EPSPS gene segment from
pMON17110 were ligated with the XhoI-BglII FMV35S promoter
fragment from pMON981. These vectors were introduced into
tobacco, cotton and canola.

A series of vectors was also completed in the vector
pMON977 in which the CP4 EPSPS gene, the CTP2-CP4 EPSPS
fusion, and the CTP3-CP4 fusion were cloned as BglII-SacI
fragments to form pMON17124, pMON17119, and pMON17120,
respectively. These plasmids were introduced into tobacco. A
pMON977 derivative containing the CTP2-LBAA EPSPS gene was
also completed (pMON17206) and introduced into tobacco.

The pMON979 plant transformation/expression vector
was derived from pMON886 (described below) by replacing the
neomycin phosphotransferase type II (KAN) gene in pMON886
with the 0.89 kb fragment containing the bacterial
gentamicin-3-N-acetyltransferase type III (AAC(3)-III) gene
(Hayford et al., 1988). The chimeric P-35S/AAC(3)-III/NOS 3' gene
encodes gentamicin resistance which permits selection of
transformed plant cells. pMON979 also contains a 0.96 kb
expression cassette consisting of the enhanced CaMV 35S
promoter (Kay et al., 1987), several unique restriction sites, and the
NOS 3' end (P-En-CaMV35S/NOS 3'). The rest of the pMON979
DNA segments are exactly the same as in pMON886.

Plasmid pMON886 is made up of the following
segments of DNA. The first is a 0.93 kb AvaI to engineered-EcoRV
fragment isolated from transposon Tn7 that encodes bacterial
spectinomycin/streptomycin resistance (Spc/Str), which is a
determinant for selection in E. coli and Agrobacterium
tumefaciens. This is joined to the 1.61 kb segment of DNA
encoding a chimeric kanamycin resistance which permits
selection of transformed plant cells. The chimeric gene
(P-35S/KAN/NOS 3') consists of the cauliflower mosaic virus
(CaMV) 35S promoter, the neomycin phosphotransferase type II
(KAN) gene, and the 3'-nontranslated region of the nopaline
synthase gene (NOS 3') (Fraley et al., 1983). The next segment is
the 0.75 kb oriV containing the origin of replication from the RK2
plasmid. It is joined to the 3.1 kb SalI to PvuI segment of pBR322
(ori322) which provides the origin of replication for maintenance
in E. coli and the bom site for the conjugational transfer into the
Agrobacterium tumefaciens cells. The next segment is the 0.36 kb
PvuI to BclI from pTiT37 that carries the nopaline-type T-DNA
right border (Fraley et al., 1985).

The pMON977 vector is the same as pMON981 except
for the presence of the P-En-CaMV35S promoter in place of the
FMV35S promoter (see below).

The pMON981 plasmid contains the following DNA
segments: the 0.93 kb fragment isolated from transposon Tn7
encoding bacterial spectinomycin/streptomycin resistance
[Spc/Str; a determinant for selection in E. coli and Agrobacterium
tumefaciens (Fling et al., 1985)]; the chimeric kanamycin
resistance gene engineered for plant expression to allow selection
of the transformed tissue, consisting of the 0.35 kb cauliflower
mosaic virus 35S promoter (P-35S) (Odell et al., 1985), the 0.83 kb
neomycin phosphotransferase type II gene (KAN), and the 0.26 kb
3'-nontranslated region of the nopaline synthase gene (NOS 3')
(Fraley et al., 1983); the 0.75 kb origin of replication from the RK2
plasmid (oriV) (Stalker et al., 1981); the 3.1 kb SalI to PvuI segment
of pBR322 which provides the origin of replication for maintenance
in E. coli (ori-322) and the bom site for the conjugational transfer
into the Agrobacterium tumefaciens cells; and the 0.36 kb PvuI to
BclI fragment from the pTiT37 plasmid containing the
nopaline-type T-DNA right border region (Fraley et al., 1985). The
expression cassette consists of the 0.6 kb 35S promoter from the
figwort mosaic virus (P-FMV35S) (Gowda et al., 1989) and the 0.7
kb 3' non-translated region of the pea rbcS-E9 gene (E9 3') (Coruzzi
et al., 1984, and Morelli et al., 1985). The 0.6 kb SspI fragment
containing the FMV35S promoter (Figure 1) was engineered to
place suitable cloning sites downstream of the transcriptional
start site. The CTP2-CP4syn gene fusion was introduced into plant
expression vectors (including pMON981, to form pMON17131;
Figure 14) and transformed into tobacco, canola, potato, tomato,
sugarbeet, cotton, lettuce, cucumber, oil seed rape, poplar, and
Arabidopsis.

The plant vector containing the Class II EPSPS gene
may be mobilized into any suitable Agrobacterium strain for
transformation of the desired plant species. The plant vector may
be mobilized into an ABI Agrobacterium strain. A suitable ABI
strain is the A208 Agrobacterium tumefaciens carrying the
disarmed Ti plasmid pTiC58 (pMP90RK) (Koncz and Schell, 1986).
The Ti plasmid does not carry the T-DNA phytohormone genes
and the strain is therefore unable to cause the crown gall disease.
Mating of the plant vector into ABI was done by the triparental
conjugation system using the helper plasmid pRK2013 (Ditta et al.,
1980). When the plant tissue is incubated with the ABI::plant
vector conjugate, the vector is transferred to the plant cells by the
vir functions encoded by the disarmed pTiC58 plasmid. The vector
opens at the T-DNA right border region, and the entire plant
vector sequence may be inserted into the host plant chromosome.
The pTiC58 Ti plasmid does not transfer to the plant cells but
remains in the Agrobacterium.

Class II EPSPS free DNA vectors

Class II EPSPS genes may also be introduced into
plants through direct delivery methods. A number of direct
delivery vectors were completed for the CP4 EPSPS gene. The
vector pMON13640, a map of which is presented in Figure 15, is
described here. The plasmid vector is based on a pUC plasmid
(Vieira and Messing, 1987) containing, in this case, the nptII gene
(kanamycin resistance; KAN) from Tn903 to provide a selectable
marker in E. coli. The CTP4-EPSPS gene fusion is expressed from
the P-FMV35S promoter and contains the NOS 3' polyadenylation
sequence fragment and from a second cassette consisting of the
E35S promoter, the CTP4-CP4 gene fusion and the NOS 3'
sequences. The scoreable GUS marker gene (Jefferson et al., 1987)
is expressed from the mannopine synthase promoter (P-MAS;
Velten et al., 1984) and the soybean 7S storage protein gene
3' sequences (Schuler et al., 1982). Similar plasmids could also be
made in which CTP-CP4 EPSPS fusions are expressed from the
enhanced CaMV35S promoter or other plant promoters. Other
vectors could be made that are suitable for free DNA delivery into
plants and such are within the skill of the art and contemplated to
be within the scope of this disclosure.

PLANT REGENERATION

When expression of the Class II EPSPS gene is
achieved in transformed cells (or protoplasts), the cells (or
protoplasts) are regenerated into whole plants. Choice of
methodology for the regeneration step is not critical, with suitable
protocols being available for hosts from Leguminosae (alfalfa,
soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip),
Cruciferae (cabbage, radish, rapeseed, etc.), Cucurbitaceae
(melons and cucumber), Gramineae (wheat, rice, corn, etc.),
Solanaceae (potato, tobacco, tomato, peppers), various floral crops
as well as various trees such as poplar or apple, nut crops or vine
plants such as grapes. See, e.g., Ammirato, 1984; Shimamoto,
1989; Fromm, 1990; Vasil, 1990.

The following examples are provided to better
elucidate the practice of the present invention and should not be
interpreted in any way to limit the scope of the present invention.
Those skilled in the art will recognize that various modifications,
truncations, etc. can be made to the methods and genes described
herein while not departing from the spirit and scope of the present
invention.

In the examples that follow, EPSPS activity in plants
is assayed by the following method. Tissue samples were collected
and immediately frozen in liquid nitrogen. One gram of young
leaf tissue was frozen in a mortar with liquid nitrogen and ground
to a fine powder with a pestle. The powder was then transferred to
a second mortar, extraction buffer was added (1 ml/gram), and
the sample was ground for an additional 45 seconds. The
extraction buffer for canola consists of 100 mM Tris, 1 mM EDTA,
10% glycerol, 5 mM DTT, 1 mM BAM, 5 mM ascorbate, 1.0 mg/ml
BSA, pH 7.5 (4°C). The extraction buffer for tobacco consists of 100
mM Tris, 10 mM EDTA, 35 mM KCl, 20% glycerol, 5 mM DTT, 1
mM BAM, 5 mM ascorbate, 1.0 mg/ml BSA, pH 7.5 (4°C). The
mixture was transferred to a microfuge tube and centrifuged for 5
minutes. The resulting supernatants were desalted on spin G-50
(Pharmacia) columns, previously equilibrated with extraction
buffer (without BSA), in 0.25 ml aliquots. The desalted extracts
were assayed for EPSP synthase activity by radioactive HPLC
assay. Protein concentrations in samples were determined by the
BioRad microprotein assay with BSA as the standard.

Protein concentrations were determined using the
BioRad Microprotein method. BSA was used to generate a
standard curve ranging from 2 - 24 µg. Either 800 µl of standard
or diluted sample was mixed with 200 µl of concentrated BioRad
Bradford reagent. The samples were vortexed and read at A(595)
after ~6 minutes and compared to the standard curve.
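
Reading an unknown against a BSA standard curve amounts to fitting a
line through the standards and inverting it. The sketch below shows
one way to do this with an ordinary least-squares fit. The absorbance
values are made up for illustration and are not data from this
specification, and treating the 2 - 24 µg range as linear is an
assumption of the example.

```python
# Minimal sketch: protein estimate from a Bradford-type standard curve.
# ASSUMPTIONS: a straight-line fit over the standard range; the A595
# readings below are invented for illustration, not measured values.


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must span more than one amount")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_ug(a595, slope, intercept):
    """Invert the standard curve: absorbance -> micrograms of protein."""
    if slope == 0:
        raise ValueError("flat standard curve cannot be inverted")
    return (a595 - intercept) / slope


if __name__ == "__main__":
    bsa_ug = [2, 6, 12, 24]            # standards within the 2 - 24 µg range
    a595 = [0.11, 0.23, 0.41, 0.77]    # hypothetical readings
    slope, intercept = fit_line(bsa_ug, a595)
    sample = protein_ug(0.35, slope, intercept)
    print(f"estimated protein in sample: {sample:.1f} µg")
```

A sample whose reading falls outside the range of the standards would
be diluted and re-read rather than extrapolated.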

EPSPS enzyme assays contained HEPES (50 mM),
shikimate-3-phosphate (2 mM), NH4 molybdate (0.1 mM) and KF (5
mM), with or without glyphosate (0.5 or 1.0 mM). The assay mix
(30 µl) and plant extract (10 µl) were preincubated for 1 minute at
25°C and the reactions were initiated by adding 14C-PEP (1 mM).
The reactions were quenched after 3 minutes with 50 µl of 90%
EtOH/0.1 M HOAc, pH 4.5. The samples were spun at 6000 rpm
and the resulting supernatants were analyzed for 14C-EPSP
production by HPLC. Percent resistant EPSPS is calculated from
the EPSPS activities with and without glyphosate.

The percent conversion of 14C labeled PEP to 14C EPSP
was determined by HPLC radioassay using a C18 guard column
(Brownlee) and an AX100 HPLC column (0.4 X 25 cm, Synchropak)
with 0.28 M isocratic potassium phosphate eluant, pH 6.5, at 1
ml/min. Initial velocities were calculated by multiplying
fractional turnover per unit time by the initial concentration of the
labeled substrate (1 mM). The assay was linear with time up to ~3
minutes and 30% turnover to EPSP. Samples were diluted with
10 mM Tris, 10% glycerol, 10 mM DTT, pH 7.5 (4°C) if necessary to
obtain results within the linear range.
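
The two calculations described above (initial velocity from fractional
turnover, and percent resistant EPSPS from paired assays) are simple
enough to write down directly. The sketch below is illustrative. The
function names and the example numbers are assumptions, while the
1 mM substrate concentration and the limits of about 3 minutes and
30% turnover are the values stated in the text.

```python
# Minimal sketch of the assay arithmetic described above.
# ASSUMPTIONS: function names and the example inputs are illustrative.


def initial_velocity(fraction_converted, minutes, substrate_mM=1.0):
    """Fractional turnover per unit time x initial substrate (mM/min)."""
    if minutes <= 0:
        raise ValueError("reaction time must be positive")
    return fraction_converted / minutes * substrate_mM


def within_linear_range(fraction_converted, minutes,
                        max_fraction=0.30, max_minutes=3.0):
    """True if the assay stayed inside the stated linear range."""
    return fraction_converted <= max_fraction and minutes <= max_minutes


def percent_resistant(activity_with_glyphosate, activity_without_glyphosate):
    """Activity remaining with glyphosate, as a percentage of the control."""
    if activity_without_glyphosate <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * activity_with_glyphosate / activity_without_glyphosate


if __name__ == "__main__":
    # Hypothetical paired assays: 15% and 12% turnover in 3 minutes.
    v_control = initial_velocity(0.15, 3.0)
    v_glyphosate = initial_velocity(0.12, 3.0)
    print("linear range respected:", within_linear_range(0.15, 3.0))
    print(f"percent resistant EPSPS: "
          f"{percent_resistant(v_glyphosate, v_control):.0f}%")
```

An extract giving more than 30% turnover would be diluted and
re-assayed, as noted above, before its velocity is used.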

In these assays DL-dithiothreitol (DTT), benzamidine
(BAM), and bovine serum albumin (BSA, essentially globulin free)
were obtained from Sigma. Phosphoenolpyruvate (PEP) was from
Boehringer Mannheim and phosphoenol-[1-14C]pyruvate (28
mCi/mmol) was from Amersham.

EXAMPLE 1

Transformed tobacco plants have been generated with
a number of the Class II EPSPS gene vectors containing the CP4
EPSPS DNA sequence as described above with suitable expression
of the EPSPS. These transformed plants exhibit glyphosate
tolerance imparted by the Class II CP4 EPSPS.

Transformation of tobacco employs the tobacco leaf
disc transformation protocol which utilizes healthy leaf tissue
about 1 month old. After a 15-20 minute surface sterilization with
10% Clorox plus a surfactant, the leaves are rinsed 3 times in
sterile water. Using a sterile paper punch, leaf discs are punched
and placed upside down on MS104 media (MS salts 4.3 g/l, sucrose
30 g/l, B5 vitamins 500X 2 ml/l, NAA 0.1 mg/l, and BA 1.0 mg/l) for
a 1 day preculture.

The discs are then inoculated with an overnight
culture of a disarmed Agrobacterium ABI strain containing the
subject vector that had been diluted 1/5 (i.e., about 0.6 OD). The
inoculation is done by placing the discs in centrifuge tubes with
the culture. After 30 to 60 seconds, the liquid is drained off and the
discs are blotted between sterile filter paper. The discs are then
placed upside down on MS104 feeder plates with a filter disc to
co-culture.

After 2-3 days of co-culture, the discs are transferred,
still upside down, to selection plates with MS104 media. After 2-3
weeks, callus tissue forms, and individual clumps are separated
from the leaf discs. Shoots are cleanly cut from the callus when
they are large enough to be distinguished from stems. The shoots
are placed on hormone-free rooting media (MSO: MS salts 4.3 g/l,
sucrose 30 g/l, and B5 vitamins 500X 2 ml/l) with selection for the
appropriate antibiotic resistance. Root formation occurs in 1-2
weeks. Any leaf callus assays are preferably done on rooted shoots
while still sterile. Rooted shoots are then placed in soil and kept in
a high humidity environment (i.e., plastic containers or bags). The
shoots are hardened off by gradually exposing them to ambient
humidity conditions.

Expression of CP4 EPSPS protein in transformed plants

Tobacco cells were transformed with a number of
plant vectors containing the native CP4 EPSPS gene, and using
different promoters and/or CTPs. Preliminary evidence for
expression of the gene was given by the ability of the leaf tissue
from antibiotic selected transformed shoots to recallus on
glyphosate. In some cases, glyphosate tolerant callus was selected
directly following transformation. The level of expression of the
CP4 EPSPS was determined by the level of glyphosate tolerant
EPSPS activity (assayed in the presence of 0.5 mM glyphosate) or by
Western blot analysis using a goat anti-CP4 EPSPS antibody. The
Western blots were quantitated by densitometer tracing and
comparison to a standard curve established using purified CP4
EPSPS. These data are presented as % soluble leaf protein. The
data from a number of transformed plant lines and
transformation vectors are presented in Table VI below.



Table VI

Vector        Plant #     CP4 EPSPS** (% leaf protein)
pMON17110     25313       0.02
pMON17110     25329       0.04
pMON17116     25095       0.02
pMON17119     25106       0.09
pMON17119     25762       0.09
pMON17119     25767       0.03

** Glyphosate tolerant EPSPS activity was also demonstrated in
leaf extracts for these plants.

Glyphosate tolerance has also been demonstrated at
the whole plant level in transformed tobacco plants. In tobacco, R0
transformants of CTP2-CP4 EPSPS were sprayed at 0.4 lb/acre
(0.448 kg/hectare), a rate sufficient to kill control non-transformed
tobacco plants corresponding to a rating of 3, 1 and 0 at days 7, 14
and 28, respectively, and were analyzed vegetatively and
reproductively (Table VII).
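
The pairing of 0.4 lb/acre with 0.448 kg/hectare can be checked with
the standard definitions of the pound and the acre. The short sketch
below does that conversion; the function name is an assumption made
for the example.

```python
# Minimal sketch: convert a spray rate from lb/acre to kg/hectare.
# The conversion factors are the exact international definitions;
# the function name is illustrative.

KG_PER_LB = 0.45359237          # international avoirdupois pound
HECTARES_PER_ACRE = 0.40468564224  # international acre


def lb_per_acre_to_kg_per_ha(rate_lb_per_acre):
    """Convert an application rate from lb/acre to kg/hectare."""
    return rate_lb_per_acre * KG_PER_LB / HECTARES_PER_ACRE


if __name__ == "__main__":
    print(f"{lb_per_acre_to_kg_per_ha(0.4):.3f} kg/hectare")
```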



Table VII

Spray rate = 0.4 lb/acre (0.448 kg/hectare)

                          Vegetative*               Fertile
Vector/Plant #         day 7   day 14   day 28
pMON17110/25313          6        4        2          no
pMON17110/25329          9       10       10          yes
pMON17119/25106          9        9       10          yes

* Plants are evaluated on a numerical scoring system of
0-10 where a vegetative score of 10 represents no damage
relative to nonsprayed controls and 0 represents a dead
plant. Reproductive scores (Fertile) are determined at 28
days after spraying and are evaluated as to whether or
not the plant is fertile.

EXAMPLE 2

Canola plants were transformed with the
pMON17110, pMON17116, and pMON17131 vectors and a number
of plant lines of the transformed canola were obtained which
exhibit glyphosate tolerance.

Plant Material

Seedlings of Brassica napus cv Westar were
established in 2 inch (~5 cm) pots containing Metro Mix 350. They
were grown in a growth chamber at 24°C, 16/8 hour photoperiod,
light intensity of 400 µE m-2 sec-1 (HID lamps). They were fertilized
with Peters 20-10-20 General Purpose Special. After 2 1/2 weeks
they were transplanted to 6 inch (~15 cm) pots and grown in a
growth chamber at 15/10°C day/night temperature, 16/8 hour
photoperiod, light intensity of 800 µE m-2 sec-1 (HID lamps). They
were fertilized with Peters 15-30-15 Hi-Phos Special.

Four terminal internodes from plants just prior to
bolting or in the process of bolting but before flowering were
removed and surface sterilized in 70% v/v ethanol for 1 minute,
2% w/v sodium hypochlorite for 20 minutes and rinsed 3 times
with sterile deionized water. Stems with leaves attached could be
refrigerated in moist plastic bags for up to 72 hours prior to
sterilization. Six to seven stem segments were cut into 5 mm discs
with a Redco Vegetable Slicer 200 maintaining orientation of basal
end.

The Agrobacterium was grown overnight on a rotator
at 24°C in 2 ml of Luria Broth containing 50 mg/l kanamycin,
24 mg/l chloramphenicol and 100 mg/l spectinomycin. A 1:10
dilution was made in MS (Murashige and Skoog) media giving
approximately 9 x 10^8 cells per ml. This was confirmed with optical
density readings at 660 mμ. The stem discs (explants) were
inoculated with 1.0 ml of Agrobacterium and the excess was
aspirated from the explants.

The explants were placed basal side down in petri
plates containing 1/10X standard MS salts, B5 vitamins, 3%
sucrose, 0.8% agar, pH 5.7, 1.0 mg/l 6-benzyladenine (BA). The
plates were layered with 1.5 ml of media containing MS salts, B5
vitamins, 3% sucrose, pH 5.7, 4.0 mg/l p-chlorophenoxyacetic acid,
0.005 mg/l kinetin and covered with sterile filter paper.

Following a 2 to 3 day co-culture, the explants were
transferred to deep dish petri plates containing MS salts, B5
vitamins, 3% sucrose, 0.8% agar, pH 5.7, 1 mg/l BA, 500 mg/l
carbenicillin, 50 mg/l cefotaxime, 200 mg/l kanamycin or 175 mg/l
gentamicin for selection. Seven explants were placed on each
plate. After 3 weeks they were transferred to fresh media, 5
explants per plate. The explants were cultured in a growth room
at 25°C, continuous light (Cool White).

Expression Assay

After 3 weeks shoots were excised from the explants.
Leaf recallusing assays were initiated to confirm modification of
R0 shoots. Three tiny pieces of leaf tissue were placed on
recallusing media containing MS salts, B5 vitamins, 3% sucrose,
0.8% agar, pH 5.7, 5.0 mg/l BA, 0.5 mg/l naphthalene acetic acid
(NAA), 500 mg/l carbenicillin, 50 mg/l cefotaxime and 200 mg/l
kanamycin or gentamicin or 0.5 mM glyphosate. The leaf assays
were incubated in a growth room under the same conditions as
explant culture. After 3 weeks the leaf recallusing assays were
scored for herbicide tolerance (callus or green leaf tissue) or
sensitivity (bleaching).



At the time of excision, the shoot stems were dipped in
Rootone® and placed in 2 inch (~ 5 cm) pots containing Metro-Mix
350 and placed in a closed humid environment. They were placed
in a growth chamber at 24°C, 16/8 hour photoperiod, 400
uEm-2sec-1 (HID lamps) for a hardening-off period of
approximately 3 weeks.

The seed harvested from R0 plants is R1 seed which
gives rise to R1 plants. To evaluate the glyphosate tolerance of an
R0 plant, its progeny are evaluated. Because an R0 plant is
assumed to be hemizygous at each insert location, selfing results
in maximum genotypic segregation in the R1. Because each insert
acts as a dominant allele, in the absence of linkage and assuming
only one hemizygous insert is required for tolerance expression,
one insert would segregate 3:1, two inserts, 15:1, three inserts 63:1,
etc. Therefore, relatively few R1 plants need be grown to find at
least one resistant phenotype.
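The segregation ratios quoted above follow from simple Mendelian
arithmetic: under the stated assumptions (unlinked inserts, each
hemizygous in the R0 plant, each acting as a dominant allele), a
selfed progeny lacks a given insert with probability 1/4, so the
tolerant fraction for n inserts is 1 - (1/4)^n. The following is a
minimal illustrative sketch of that calculation; the function names
are hypothetical and are not part of the described procedure.

```python
from fractions import Fraction


def tolerant_fraction(n_inserts: int) -> Fraction:
    """Expected fraction of selfed R1 progeny carrying at least one insert.

    Assumes unlinked, hemizygous, dominant inserts (the assumptions
    stated in the text); each insert is absent from a progeny plant
    with probability 1/4.
    """
    if n_inserts < 1:
        raise ValueError("n_inserts must be at least 1")
    return 1 - Fraction(1, 4) ** n_inserts


def segregation_ratio(n_inserts: int) -> int:
    """Tolerant:sensitive ratio expressed as X:1, i.e. X = 4**n - 1."""
    p = tolerant_fraction(n_inserts)
    return int(p / (1 - p))


for n in (1, 2, 3):
    print(f"{n} insert(s): {segregation_ratio(n)}:1 tolerant:sensitive")
```

Exact fractions are used so that the ratios come out as whole
numbers without floating-point rounding.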

Seed from an R0 plant is harvested, threshed, and
dried before planting in a glyphosate spray test. Various
techniques have been used to grow the plants for R1 spray
evaluations. Tests are conducted in both greenhouses and growth
chambers. Two planting systems are used: ~10 cm pots or plant
trays containing 32 or 36 cells. Soil used for planting is either
Metro 350 plus three types of slow release fertilizer or plain Metro
350. Irrigation is either overhead in greenhouses or sub-irrigation
in growth chambers. Fertilizer is applied as required in irrigation
water. Temperature regimes appropriate for canola were
maintained. A sixteen hour photoperiod was maintained. At the
onset of flowering, plants are transplanted to ~15 cm pots for seed
production.

A spray "batch" consists of several sets of R1
progenies all sprayed on the same date. Some batches may also
include evaluations of other than R1 plants. Each batch also
includes sprayed and unsprayed non-transgenic genotypes
representing the genotypes in the particular batch which were
putatively transformed. Also included in a batch is one or more
non-segregating transformed genotypes previously identified as
having some resistance.

Two to six plants from each individual R0 progeny are
not sprayed and serve as controls to compare and measure the
glyphosate tolerance, as well as to assess any variability not
induced by the glyphosate. When the other plants reach the 2-4
leaf stage, usually 10 to 20 days after planting, glyphosate is
applied at rates varying from 0.28 to 1.12 kg/ha, depending on
objectives of the study. Low rate technology using low volumes has
been adopted. A laboratory track sprayer has been calibrated to
deliver a rate equivalent to field conditions.
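Spray rates in this specification are given both in lb/acre and in
kg/ha. As an illustrative sketch only, the conversion between the two
can be written as below, using the exact definitions of the
avoirdupois pound and the international acre; the function names are
hypothetical.

```python
# Exact unit definitions (international avoirdupois pound and acre).
KG_PER_LB = 0.45359237
HA_PER_ACRE = 0.40468564224


def lb_per_acre_to_kg_per_ha(rate_lb_per_acre: float) -> float:
    """Convert an application rate from lb/acre to kg/ha."""
    return rate_lb_per_acre * KG_PER_LB / HA_PER_ACRE


def kg_per_ha_to_lb_per_acre(rate_kg_per_ha: float) -> float:
    """Convert an application rate from kg/ha to lb/acre."""
    return rate_kg_per_ha * HA_PER_ACRE / KG_PER_LB


# The tobacco spray rate quoted in Example 1: 0.4 lb/acre.
print(round(lb_per_acre_to_kg_per_ha(0.4), 3))

# The canola rate range quoted above, expressed in lb/acre.
for kg_ha in (0.28, 0.56, 0.84, 1.12):
    print(kg_ha, "kg/ha =", round(kg_per_ha_to_lb_per_acre(kg_ha), 2), "lb/acre")
```

On this basis the 0.28 to 1.12 kg/ha range corresponds to roughly
0.25 to 1 lb/acre.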

A scale of 0 to 10 is used to rate the sprayed plants for
vegetative resistance. The scale is relative to the unsprayed plants
from the same R0 plant. A 0 is death, while a 10 represents no
visible difference from the unsprayed plant. A higher number
between 0 and 10 represents progressively less damage as
compared to the unsprayed plant. Plants are scored at 7, 14, and
28 days after treatment (DAT), or until bolting, and a line is given
the average score of the sprayed plants within a plant family.
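The line score described above is an average over the sprayed plants
of one family at each scoring date. A minimal sketch of that
bookkeeping follows; the function name and the example scores are
hypothetical and are not data from this specification.

```python
from statistics import mean


def line_scores(scores_by_day: dict) -> dict:
    """Average the 0-10 vegetative scores of the sprayed plants in one
    family, separately for each scoring day (days after treatment)."""
    result = {}
    for day, scores in scores_by_day.items():
        if not scores:
            raise ValueError(f"no scores recorded for day {day}")
        if any(not 0 <= s <= 10 for s in scores):
            raise ValueError("vegetative scores must lie between 0 and 10")
        result[day] = mean(scores)
    return result


# Hypothetical scores for four sprayed plants from one R0 family.
example_family = {7: [9, 8, 9, 10], 14: [9, 9, 10, 10], 28: [10, 10, 10, 10]}
print(line_scores(example_family))
```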

Six integers are used to qualitatively describe the
degree of reproductive damage from glyphosate:

     0:   No floral bud development
     2:   Floral buds present, but aborted prior to opening
     4:   Flowers open, but no anthers, or anthers
          fail to extrude past petals
     6:   Sterile anthers
     8:   Partially sterile anthers
    10:   Fully fertile flowers

Plants are scored using this scale at or shortly after
initiation of flowering, depending on the rate of floral structure
development.



Expression of EPSPS in Canola

After the 3 week period, the transformed canola
plants were assayed for the presence of glyphosate tolerant EPSPS
activity (assayed in the presence of glyphosate at 0.5 mM). The
results are shown in Table VIII.



Table VIII

                              % resistant EPSPS
                              activity of leaf extract
Vector         Plant #        (at 0.5 mM glyphosate)

Control                          0
pMON17110       41              47
pMON17110       52              28
pMON17110       71              82
pMON17110      104              75
pMON17110      172              84
pMON17110      177              85
pMON17110      252              29*
pMON17110      350              49
pMON17116       40              25
pMON17116       99              87
pMON17116      175              94
pMON17116      178              43
pMON17116      182              18
pMON17116      252              69
pMON17116      296              44*
pMON17116      332              89
pMON17116      383              97
pMON17116      395              52

* assayed in the presence of 1.0 mM glyphosate

R1 transformants of canola were then grown in a
growth chamber and sprayed with glyphosate at 0.56 kg/ha
(kilogram/hectare) and rated vegetatively. These results are
shown in Tables IXA - IXC. It is to be noted that expression of
glyphosate resistant EPSPS in all tissues is preferred to observe
optimal glyphosate tolerance phenotype in these transgenic plants.
In the Tables below, only expression results obtained with leaf
tissue are described.



Table IXA. Glyphosate tolerance in Class II EPSPS
canola transformants

(pMON17110 = P-E35S; pMON17116 = P-FMV35S; R1 plants;
Spray rate = 0.56 kg/ha)

                       % resistant     Vegetative score**
Vector/Plant No.       EPSPS*          day 7     day 14

Control Westar            0              5          3
pMON17110/41             47              6          7
pMON17110/71             82              6          7
pMON17110/177            85              9         10
pMON17116/40             25              9          9
pMON17116/99             87              9         10
pMON17116/175            94              9         10
pMON17116/178            43              6          3
pMON17116/182            18              9         10
pMON17116/383            97              9         10



Table IXB. Glyphosate tolerance in Class II EPSPS
canola transformants

(pMON17131 = P-FMV35S; R1 plants; Spray rate = 0.84 kg/ha)

Vector/Plant No.     Vegetative score**     Reproductive score

17131/78                   10                     10
17131/102                   9                     10
17131/115                   9                     10
17131/116                   9                     10
17131/157                   9                     10
17131/169                  10                     10
17131/255                  10                     10
control Westar              1                      0



Table IXC. Glyphosate tolerance in Class I EPSPS
canola transformants

(P-E35S; R2 plants; Spray rate = 0.28 kg/ha)

                       % resistant     Vegetative score**
Vector/Plant No.       EPSPS*          day 7     day 14

Control Westar            0              4          2
pMON899/715              96              5          6
pMON899/744              95              8          8
pMON899/794              86              6          4
pMON899/818              81              7          8
pMON899/885              57              7          6



* % resistant EPSPS activity in the presence of 0.5 mM glyphosate

** A vegetative score of 10 indicates no damage, a score of 0 is
given to a dead plant.

The data obtained for the Class II EPSPS
transformants may be compared to glyphosate tolerant Class I
EPSP transformants in which the same promoter is used to
express the EPSPS genes and in which the level of glyphosate
tolerant EPSPS activity was comparable for the two types of
transformants. A comparison of the data of pMON17110 [in
Table IXA] and pMON17131 [Table IXB] with that for pMON899 [in
Table IXC; the Class I gene in pMON899 is that from A. thaliana
(Klee et al., 1987) in which the glycine at position 101 was changed
to an alanine] illustrates that the Class II EPSPS is at least as good
as that of the Class I EPSPS. An improvement in vegetative
tolerance of Class II EPSPS is apparent when one takes into
account that the Class II plants were sprayed at twice the rate and
were tested as R1 plants.

EXAMPLE 3

Soybean plants were transformed with the pMON13640 (Figure 15)
vector and a number of plant lines of the transformed soybean
were obtained which exhibit glyphosate tolerance.

Soybean plants are transformed with pMON13640 by
the method of microprojectile injection using particle gun
technology as described in Christou et al. (1988). The seed
harvested from R0 plants is R1 seed which gives rise to R1 plants.
To evaluate the glyphosate tolerance of an R0 plant, its progeny are
evaluated. Because an R0 plant is assumed to be hemizygous at
each insert location, selfing results in maximum genotypic
segregation in the R1. Because each insert acts as a dominant
allele, in the absence of linkage and assuming only one
hemizygous insert is required for tolerance expression, one insert
would segregate 3:1, two inserts, 15:1, three inserts 63:1, etc.
Therefore, relatively few R1 plants need be grown to find at least
one resistant phenotype.

Seed from an R0 soybean plant is harvested, and dried
before planting in a glyphosate spray test. Seeds are planted into 4
inch (~5 cm) square pots containing Metro 350. Twenty seedlings
from each R0 plant are considered adequate for testing. Plants are
maintained and grown in a greenhouse environment. A 12.5-14
hour photoperiod and temperatures of 30°C day and 24°C night are
maintained. Water soluble Peters Pete Lite fertilizer is applied as
needed.
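The statement that relatively few R1 plants need be grown, and that
twenty seedlings per R0 plant are adequate, can be checked with a
short calculation. Under the segregation assumptions given above
(unlinked, hemizygous, dominant inserts), a single R1 plant lacks all
n inserts with probability (1/4)^n. The sketch below, with
hypothetical function names, estimates how many R1 plants are needed
to see at least one tolerant plant with a chosen confidence.

```python
import math


def prob_no_tolerant(n_inserts: int, n_plants: int) -> float:
    """Probability that none of n_plants R1 progeny carries any insert."""
    return (0.25 ** n_inserts) ** n_plants


def plants_needed(n_inserts: int, confidence: float = 0.99) -> int:
    """Smallest number of R1 plants giving at least `confidence`
    probability of including one or more tolerant plants."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    if n_inserts < 1:
        raise ValueError("n_inserts must be at least 1")
    q = 0.25 ** n_inserts  # chance that one plant lacks every insert
    return math.ceil(math.log(1 - confidence) / math.log(q))


for n in (1, 2, 3):
    print(n, "insert(s):", plants_needed(n), "plants for 99% confidence")

# Chance that twenty seedlings from a single-insert R0 plant are all sensitive.
print(prob_no_tolerant(1, 20))
```

Even for a single insert, the chance that all twenty seedlings are
sensitive is below one in a trillion under these assumptions.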

A spray "batch" consists of several sets of R1
progenies all sprayed on the same date. Some batches may also
include evaluations of other than R1 plants. Each batch also
includes sprayed and unsprayed non-transgenic genotypes
representing the genotypes in the particular batch which were
putatively transformed. Also included in a batch is one or more
non-segregating transformed genotypes previously identified as
having some resistance.

One to two plants from each individual R0 progeny are
not sprayed and serve as controls to compare and measure the
glyphosate tolerance, as well as to assess any variability not
induced by the glyphosate. When the other plants reach the first
trifoliate leaf stage, usually 2-3 weeks after planting, glyphosate is
applied at a rate equivalent of 128 oz/acre (8.895 kg/ha) of
Roundup®. A laboratory track sprayer has been calibrated to
deliver a rate equivalent to those conditions.

A vegetative score of 0 to 10 is used. The score is
relative to the unsprayed progenies from the same R0 plant. A 0 is
death, while a 10 represents no visible difference from the
unsprayed plant. A higher number between 0 and 10 represents
progressively less damage as compared to the unsprayed plant.
Plants are scored at 7, 14, and 28 days after treatment (DAT). The
data from the analysis of one set of transformed and control
soybean plants are described in Table X and show that the CP4
EPSPS gene imparts glyphosate tolerance in soybean also.

Table X. Glyphosate tolerance in Class II EPSPS soybean
transformants

(P-E35S, P-FMV35S; R0 plants; Spray rate = 128 oz/acre)

                          Vegetative score
Vector/Plant No.       day 7    day 14    day 28

13640/40-11              5         6         7
13640/40-?               9        10        10
13640/40-7               4         7         7
control A5403            2         1         0
control A5403            1         1         0



The CP4 EPSPS gene may be used to select
transformed plant material directly on media containing
glyphosate. The ability to select and to identify transformed plant
material depends, in most cases, on the use of a dominant
selectable marker gene to enable the preferential and continued
growth of the transformed tissues in the presence of a normally
inhibitory substance. Antibiotic resistance and herbicide tolerance
genes have been used almost exclusively as such dominant
selectable marker genes in the presence of the corresponding
antibiotic or herbicide. The nptII/kanamycin selection scheme is
probably the most frequently used. It has been demonstrated that
CP4 EPSPS is also a useful and perhaps superior selectable
marker/selection scheme for producing and identifying
transformed plants.

A plant transformation vector that may be used in this
scheme is pMON17227 (Figure 16). This plasmid resembles many
of the other plasmids described infra and is essentially composed
of the previously described bacterial replicon system that enables
this plasmid to replicate in E. coli and to be introduced into and to
replicate in Agrobacterium, the bacterial selectable marker gene
(Spc/Str), and located between the T-DNA right border and left
border is the CTP2-CP4 synthetic gene in the FMV35S promoter-E9
3' cassette. This plasmid also has single sites for a number of
restriction enzymes, located within the borders and outside of the
expression cassette. This makes it possible to easily add other
genes and genetic elements to the vector for introduction into
plants.

The protocol for direct selection of transformed plants
on glyphosate is outlined for tobacco. Explants are prepared for
pre-culture as in the standard procedure as described in Example
1: surface sterilization of leaves from 1 month old tobacco plants
(15 minutes in 10% Clorox + surfactant; 3X dH2O washes);
explants are cut in 0.5 x 0.5 cm squares, removing leaf edges,
mid-rib, tip, and petiole end for uniform tissue type; explants are
placed in single layer, upside down, on MS104 plates + 2 ml
4C005K media to moisten surface; pre-culture 1-2 days. Explants
are inoculated using overnight culture of Agrobacterium
containing the plant transformation plasmid that is adjusted to a
titer of 1.2 x 10^9 bacteria/ml with 4C005K media. Explants are
placed into a centrifuge tube, the Agrobacterium suspension is
added and the mixture of bacteria and explants is "Vortexed" on
maximum setting for 25 seconds to ensure even penetration of
bacteria. The bacteria are poured off and the explants are blotted
between layers of dry sterile filter paper to remove excess bacteria.
The blotted explants are placed upside down on MS104 plates + 2 ml
4C005K media + filter disc. Co-culture is 2-3 days. The explants
are transferred to MS104 + Carbenicillin 1000 mg/l + cefotaxime
100 mg/l for 3 days (delayed phase). The explants are then
transferred to MS104 + glyphosate 0.05 mM + Carbenicillin 1000
mg/l + cefotaxime 100 mg/l for selection phase. At 4-6 weeks
shoots are cut from callus and placed on MSO + Carbenicillin 500
mg/l rooting media. Roots form in 3-5 days, at which time leaf
pieces can be taken from rooted plates to confirm glyphosate
tolerance and that the material is transformed.

The presence of the CP4 EPSPS protein in these
transformed tissues has been confirmed by immunoblot analysis
of leaf discs. The data from one experiment with pMON17227 are
presented in the following: 139 shoots formed on glyphosate from
400 explants inoculated with Agrobacterium ABI/pMON17227; 97
of these were positive on recallusing on glyphosate. These data
indicate a transformation rate of 24 per 100 explants, which makes
this a highly efficient and time saving transformation procedure
for plants. Similar transformation frequencies have been obtained
with pMON17131, and direct selection of transformants on
glyphosate with the CP4 EPSPS genes has also been shown in
other plant species, including Arabidopsis, potato, tomato, cotton,
lettuce, and sugarbeet.
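The transformation rate quoted above can be reproduced from the three
counts reported for the pMON17227 experiment. The sketch below is
illustrative only; the helper name is hypothetical, and the second
figure (the share of shoots confirmed on recallusing) is derived here
from the same counts rather than stated in the text.

```python
def per_hundred(count: int, total: int) -> float:
    """Express count as a rate per 100 of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


explants = 400   # explants inoculated with Agrobacterium ABI/pMON17227
shoots = 139     # shoots formed on glyphosate
confirmed = 97   # shoots positive on recallusing on glyphosate

# Confirmed transformants per 100 explants (reported as 24 in the text).
print(round(per_hundred(confirmed, explants)))

# Share of glyphosate-selected shoots confirmed on recallusing.
print(round(per_hundred(confirmed, shoots), 1))
```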

The pMON17227 plasmid contains single restriction
enzyme recognition cleavage sites (NotI, XhoI, and BstXI)
between the CP4 glyphosate selection region and the left border of
the vector for the cloning of additional genes and to facilitate the
introduction of these genes into plants.

EXAMPLE 5

The CP4 EPSPS gene has also been introduced into
Black Mexican Sweet (BMS) corn cells with expression of the
protein and glyphosate resistance detected in callus.

The backbone for this plasmid was a derivative of the
high copy plasmid pUC119 (Viera and Messing, 1987). The 1.3Kb
FspI-DraI pUC119 fragment containing the origin of replication
was fused to the 1.3Kb SmaI-HindIII filled fragment from pKC7
(Rao and Rogers, 1979) which contains the neomycin
phosphotransferase type II gene to confer bacterial kanamycin
resistance. This plasmid was used to construct a monocot
expression cassette vector containing the 0.6kb cauliflower mosaic
virus (CaMV) 35S RNA promoter with a duplication of the -90 to
-300 region (Kay et al., 1987), an 0.8kb fragment containing an
intron from a maize gene in the 5' untranslated leader region,
followed by a polylinker and the 3' termination sequences from the
nopaline synthase (NOS) gene (Fraley et al., 1983). A 1.7Kb
fragment containing the 300bp chloroplast transit peptide from the
Arabidopsis EPSP synthase, fused in frame to the 1.4Kb coding
sequence for the bacterial CP4 EPSP synthase, was inserted into the
monocot expression cassette in the polylinker between the intron
and the NOS termination sequence to form the plasmid
pMON19653 (Figure 17).

pMON19653 DNA was introduced into Black Mexican
Sweet (BMS) cells by co-bombardment with EC9, a plasmid
containing a sulfonylurea-resistant form of the maize acetolactate
synthase gene. 2.5mg of each plasmid was coated onto tungsten
particles and introduced into log-phase BMS cells using a
PDS-1000 particle gun essentially as described (Klein et al., 1989).



WO 92/04449          PCT/US91/06148



Transformants are selected on MS medium containing 20ppb
chlorsulfuron. After initial selection on chlorsulfuron, the calli
can be assayed directly by Western blot. Glyphosate tolerance can
be assessed by transferring the calli to medium containing 5mM
glyphosate. As shown in Table XI, CP4 EPSPS confers glyphosate
tolerance to corn callus.

Table XI. Expression of CP4 in BMS Corn Callus - pMON19653

               CP4 expression
Line           (% extracted protein)

284              0.006
287              0.036
290              0.061
295              0.073
299              0.113
309              0.042
313              0.003



To measure CP4 EPSPS expression in corn callus, the
following procedure was used: BMS callus (3 g wet weight) was
dried on filter paper (Whatman #1) under vacuum, reweighed, and
extraction buffer (500 μl/g dry weight; 100 mM Tris, 1 mM EDTA,
10% glycerol) was added. The tissue was homogenized with a
Wheaton overhead stirrer for 30 seconds at 2.8 power setting.
After centrifugation (3 minutes, Eppendorf microfuge), the
supernatant was removed and the protein was quantitated (BioRad
Protein Assay). Samples (50 μg/well) were loaded on an SDS
PAGE gel (Jule, 3-17%) along with CP4 EPSPS standard (10 ng),
electrophoresed, and transferred to nitrocellulose similarly to a
previously described method (Padgette, 1987). The nitrocellulose
blot was probed with goat anti-CP4 EPSPS IgG, and developed with
I-125 Protein G. The radioactive blot was visualized by
autoradiography. Results were quantitated by densitometry on an
LKB UltraScan XL laser densitometer and are tabulated in
Table XI.
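The expression values are reported as a percentage of extracted
protein. One way such a percentage could be derived from the
densitometry readings is sketched below. This is an assumption-laden
illustration, not the procedure actually used: it assumes a linear
densitometer response and a single-point calibration against the
10 ng CP4 EPSPS standard, with 50 μg of total protein loaded per
well. The function name and the example signal values are
hypothetical.

```python
def percent_of_extracted_protein(
    sample_signal: float,
    standard_signal: float,
    standard_ng: float = 10.0,   # CP4 EPSPS standard loaded on the gel
    loaded_ug: float = 50.0,     # total extracted protein loaded per well
) -> float:
    """Estimate CP4 EPSPS as a percentage of extracted protein.

    Assumes band signal is proportional to the amount of CP4 EPSPS
    (single-point calibration), which is a simplification.
    """
    if standard_signal <= 0 or loaded_ug <= 0:
        raise ValueError("standard_signal and loaded_ug must be positive")
    cp4_ng = sample_signal / standard_signal * standard_ng
    loaded_ng = loaded_ug * 1000.0
    return 100.0 * cp4_ng / loaded_ng


# Hypothetical readings: a sample band 5.65 times as intense as the standard
# corresponds to 56.5 ng of CP4 EPSPS in 50 ug of protein.
print(percent_of_extracted_protein(sample_signal=5.65, standard_signal=1.0))
```

Under these assumptions a band 5.65 times the intensity of the
standard corresponds to about 0.113% of extracted protein, the
highest value in the range reported for the callus lines.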



Table XII. Glyphosate selection in BMS Corn Callus

                               # chlorsulfuron-      # cross-resistant
Vector          Experiment     resistant lines       to glyphosate

19653           253               120                81/120 = 67.5%
19653           254                80                37/80  = 46%
EC9 control     253/254             8                0/8    = 0%



Improvements in the expression of Class I EPSPS
could also be achieved by expressing the gene using stronger plant
promoters, using better 3' polyadenylation signal sequences,
optimizing the sequences around the initiation codon for ribosome
loading and translation initiation, or by combination of these or
other expression or regulatory sequences or factors. It would also
be beneficial to transform the desired plant with a Class I EPSPS
gene in conjunction with another glyphosate tolerant EPSPS gene
or a gene capable of degrading glyphosate in order to enhance the
glyphosate tolerance of the transformed plant.

From the foregoing, it will be seen that this invention
is one well adapted to attain all the ends and objects hereinabove
set forth together with advantages which are obvious and which
are inherent to the invention.

It will be understood that certain features and
subcombinations are of utility and may be employed without
reference to other features and subcombinations. This is
contemplated by and is within the scope of the claims.

Since many possible embodiments may be made of the
invention without departing from the scope thereof, it is to be
understood that all matter herein set forth or shown in the
accompanying drawings is to be interpreted as illustrative and not
in a limiting sense.

EXAMPLE 6

The LBAA Class II EPSPS gene has been introduced
into plants and also imparts glyphosate tolerance. Data on tobacco
transformed with pMON17206 (infra) are presented in Table XIII.

Table XIII. Tobacco Glyphosate Spray Test
(pMON17206; E35S - CTP2-LBAA EPSPS; 0.4 lbs/ac)

Line                  7 Day Rating

33358                      9
34586                      9
33328                      9
34606                      9
33377                      9
34611                     10
34607                     10
34601                      9
34589                      9
Samsun (Control)           4


SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT: Barry, Gerard F.
                    Kishore, Ganesh K.
                    Padgette, Stephen R.

    (ii) TITLE OF INVENTION: Glyphosate Tolerant
         5-Enolpyruvylshikimate-3-Phosphate Synthases

   (iii) NUMBER OF SEQUENCES: 36

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Dennis R. Hoerner, Jr., Monsanto Co. BB4F
          (B) STREET: 700 Chesterfield Village Parkway
          (C) CITY: St. Louis
          (D) STATE: Missouri
          (E) COUNTRY: USA
          (F) ZIP: 63198

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: US 07/576537
          (B) FILING DATE: 31-AUG-1990
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Hoerner Jr., Dennis R.
          (B) REGISTRATION NUMBER: 30,914
          (C) REFERENCE/DOCKET NUMBER: 38-21(10535)

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (314)537-6099
          (B) TELEFAX: (314)537-6047

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 597 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



A & AT* 
TWATCAAAA X 


XTTTAGCAGC 


ATTCCAGATT 


GGGTTCAATC 


AACAAGGTAC 


GAGCCATATC 


60 


ACXXTAx 


AAX XA*uxnxv# 


G CC AAAACCA 


AGAAGGAACT 


CCCATCCTCA 


AAGGTTTGTA 


120 


AGGAAGAATT 


CTCAG XC C AA 


AVv WVo X WfWlW** 


AGGTCAGG6T 


ACAGAGTCTC 


CAAACCATTA 


180 


G CC AAAAGCT 


ACACKvAGAXC* 


B.&*rAA&G&AT 
#Wi X ^ AnuA/\X 


CTTCAATCAA 


AGTAAACTAC 


TGTTCCAGCA 


240 


CATCCATCAT 


GGTCAGTAAG 


TTTCAGAAAA 


AGACATCCAC 


CGAAGACTTA 


AAGTTAGTGG 


300 


GCATCTTTGA 


AAGTAATCTT 


GTCAACATCG 


AGCAGCTGGC 


TTGTGGGGAC 


CAGACAAAAA 


360 


AGGAATGGTG 


CAGAATTGTT 


AGGCGCACCT 


ACCAAAAGCA 


TCTTTGCCTT 


TATTGCAAAG 


420 


ATAAAGCAGA 


TTCCTCTAGT 


ACAAGTGGGG 


AACAAAATAA 


CGTGGAAAAG 


AGCTGTCCTG:? 


480 


ACAGCCCACT 


CACTAATGCG 


TATGACGAAC 


GCAGXGACGA 


CCACAAAAGA 


ATTCCCTCTA 


540 


TATAAGAAGG 


CATTCATTCC 


CATTTGAAGG 


ATCATCAGAT 


ACTAACCAAT 


ATTTCTC 


597 



(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1982 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 62..1426



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AAGCCCGCGT TCTCTCCGGC GCTCCGCCCG GAGAGCCGTG GATAGATTAA GGAAGACGCC 60 

C ATG TCG CAC GGT GCA AGC AGC CGG CCC GCA ACC GCC CGC AAA TCC 106 
Met Ser His Gly Ala Ser Ser Arg Pro Ala Thr^ Ala Arg Lys Ser 
15 10 15 

TCT GGC CTT TCC GGA ACC GTC CGC ATT CCC CGC GAC AAG TCG ATC TCC 154 
Ser Gly Leu Ser Gly Thr Val Arg lie Pro Gly Asp Lys Ser lie Ser 

20 25 30 

CAC CGG TCC TTC ATG TTC GGC GGT CTC GCG AGC GGT GAA ACG CGC ATC 202 
His Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg lie 
35 40 45 
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ACC GGC CTT CTG GAA GGC GAG GAC GTC ATC AAT ACG 6CC AAG GCC ATG 250 
Thr Gly Leu Leu Glu Gly Glu Asp Val He Asn Thr Gly Lys Ala Met 
50 S5 60 

CAG GCC ATG GGC GCC A6G ATC CGT AAG GAA GGC GAC ACC TGG ATC ATC 298 
cm Ala Met Gly Ala Arg He Arg Lya Glu Gly Asp Thr Trp He He 
65 70 75 

GAT GGC GTC GGC AAT GGC GGC CTC CTG 6CG CCT GAG GCG CCG CTC GAT 346 
Asp Gly val Gly Asn Gly Gly Leu Leu Ala Pro Glu Ala Pro Leu Asp 



80 



85 



TTC GGC AAT GCC GCC ACG GGC TGC CGC CTG ACC ATG GGC CTC GTC GGG 394 
Phe Gly Asn Ala Ala Thr Gly Cys Arg Leu Thr Met Gly Leu Val Gly 
100 105 110 

GTC TAC GAT TTC GAC AGC ACC TTC ATC GGC GAC GCC TCG^CTC ACA AAG 442 
Val Tyr Asp Phe Asp Ser Thr Phe He Gly Asp Ala Ser Leu Thr Lys 
115 120 125 

CGC CCG ATG GGC CGC GTG TTO AAC CCG CTG CGC GAA ATG CGC GTG CAG 490 
Arg Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gin 
130 135 140 

GTG AAA TC6 GAA GAC GGT GAC CGT CTT CCC GTT ACC TTG CGC GGG CCG 538 
Val Lys Ser Glu Asp Gly Asp Arg Leu Pro Val Thr Leu Arg Gly Pro 
145 150 155 

AAG ACG CCG ACG CCG ATC ACC TAC CGC GTG CCG ATG GCC TCC GCA CAG 586 
Lys Thr Pro Thr Pro He Thr Tyr Arg Val Pro Met Ala Ser Ala Gin 
160 165 170 175 

GTG AAG TCC GCC GTG CTG CTC GCC GGC CTC AAC ACG CCC GGC ATC ACG 634 
Val Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly He Thr 



180 



185 190 



ACG GTC ATC GAG CCG ATC ATG ACG CGC GAT CAT ACG GAA AAG ATG CTG 682 
Thr Val He Glu Pro He Met Thr Arg Asp His Thr Glu Lys Met Leu 
195 200 205 

CAG GGC TTT GGC GCC AAC CTT ACC GTC GAG ACG GAT GCG GAC GGC GTG 730 
Gin Gly Phe Gly Ala Asn Leu Thr Val Glu Thr Asp Ala Asp Gly Val 
210 215 220 

CGC ACC ATC CGC CTG GAA GGC CGC GGC AAG CTC ACC GGC CAA GTC ATC 778 
Arg Thr He Arg Leu Glu Gly Arg Gly Lys Leu Thr Gly Oln Val He 
225 230 235 

GAC GTG CCG GGC GAC CCG TCC TCG ACG GCC TTC CCG CTG GTT GCG GCC 826 
Asp Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala 
240 245 250 255 

CTG CTT GTT CCG GGC TCC GAC GTC ACC ATC CTC AAC GTG CTG ATG AAC 874 
Leu Leu Val Pro Gly Ser Asp Val Thr He Leu Asn Val Leu Met Asn 
260 265 270 
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CCC ACC CGC ACC CGC CTC ATC CTG ACG CTG CAG GAA ATG GGC GCC CAC 
Pro Thr Arg Thr Gly Leu lie I«u Thr I«u Gin Glu Met Gly Ala Asp 
275 280 285 

ATC GAA CTC ATC AAC CCG CGC CTT GCC GGC GGC GAA GAC GTG GCG GAC 
He Glu Val He Asn Pro Arg Leu Ala Gly Gly Glu Asp Val Ala Aep 
290 295 300 

CTG CGC GTT CGC TCC TCC ACG CTG AAG GGC GTC ACG GTG CCG GAA GAC 
Leu Arg Val Arg Ser Ser Thr Leu Lys Gly Val Thr Val Pro Glu Asp 
305 310 315 

CGC GCG CCT TCG ATG ATC GAC GAA TAT CCG ATT CTC GCT GTC GCC GCC 
Ara Ala Pro Ser Met He Asp Glu Tyr Pro He Leu Ala Val Ala Ala 
320 325 330 335 

GCC TTC GCG GAA CGG GCG ACC GTG ATG AAC GGT CTG GAA GAA CTC CGC 
Ala Phe Ala Glu Gly Ala Thr Val Met Asn Gly Leu Glu Glu Leu Arg 
340 345 350 

GTC AAG GAA AGC GAC CGC CTC TCG GCC GTC GCC AAT GGC CTC AAG CTC 

Val Lys Glu ser Asp Arg Leu Ser Ala Val Ala Asn Gly Leu Lys Leu 
355 360 365 

AAT GGC GTG GAT TGC GAT GAG GGC GAG ACG TCG CTC GTC GTG CGC GGC 
Asn Gly Val Asp Cys Asp Glu Gly Glu Thr Ser Leu Val Val Arg Gly 
370 375 380 

CGC CCT CAC GGC AAG GGG CTC GGC AAC GCC TCG GGC GCC GCC GTC GCC 
Arg Pro Asp Gly Lys Gly Leu Gly Asn Ala Ser Gly Ala Ala Val Ala 
385 390 395 

ACC CAT CTC GAT CAC CGC ATC GCC ATG AGC TTC CTC GTC ATG GGC CTC 
Thr His Leu Asp His Arg He Ala Met Ser Phe Leu Val Met Gly Leu . „ 
400 405 410 415 

GTG TCG GAA AAC CCT GTC ACG GTG GAC GAT GCC ACG ATG ATC GCC ACG 
Val Ser Glu Asn Pro Val Thr Val Asp Asp Ala Thr Met He Ala Thr 
420 425 430 

AGC TTC CCG GAG TTC ATG GAC CTG ATG GCC GGG CTG GGC GCG AAG ATC 
ser Phe Pro Glu Phe Met Asp Leu Met Ala Gly Leu Gly Ala Lys He 
435 440 445 

GAA CTC TCC GAT ACG AAG GCT GCC TGAT6ACCTT CACAATCGCC ATCGATGGTC 
Glu Leu Ser Asp Thr Lys Ala Ala 
450 455 

CCGCTGC6GC CG6CAAGGGG ACGCTCTCGC GCCGTATCGC GGAGGTCTAT GGCTTTCATC 

ATCTCGATAC GGGCCTGACC TATCGCGCCA CGGCCAAAGC GCTGCTCGAT CGCGGCCTGT 

CGCTTGATGA C6AGGCG6TT GCGGCCGATG TCGCCCGCAA TCTCGATCTT GCCGGGCTCG 

ACCGGTCGGT GCTGTCGGCC CATGCCATCG GCGAGGCGGC TTCGAAGATC GCGGTCATGC 

CCTCGGTCCG GCGGGCGCTG CTCGAGGCGC AGCGCAGCTT TGCGGCGCGT GAGCCCGCCA 



922 
970 
1016 
1066 
1114 
1162 
1210 
1258 
1306 
1354 
1402 
1456 

1516 
1576 
1636 
1696 
1756 



1982 
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CGGTGCTGGA TCGACGCGAT ATCGGCAC»G TGGTCTGCCC GGATCOGCCG GTGAAGCTCT 1816 
ATGTCACCGC GTCACCGGAA 6TGCGCGCGA AACGCCGCTA T6ACGAAATC CTCGGCAATG 1876 
GCGGGTTGGC C6ATTACGGG ACGATCCTCC AGGATATCCG CCX5CCGCGAC GAGCGGGACA 1936 
TGGGTC6GGC GGACAGTCCT TTGAAGCCCG CCGACGATGC GCACTT 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 455 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Ser His Gly Ala Ser Ser Arg Pro Ala Thr Ala Arg Lys Ser Ser
1 5 10 15

Gly Leu Ser Gly Thr Val Arg Ile Pro Gly Asp Lys Ser Ile Ser His
20 25 30

Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg Ile Thr
35 40 45

Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Lys Ala Met Gln
50 55 60

Ala Met Gly Ala Arg Ile Arg Lys Glu Gly Asp Thr Trp Ile Ile Asp
65 70 75 80

Gly Val Gly Asn Gly Gly Leu Leu Ala Pro Glu Ala Pro Leu Asp Phe
85 90 95

Gly Asn Ala Ala Thr Gly Cys Arg Leu Thr Met Gly Leu Val Gly Val
100 105 110

Tyr Asp Phe Asp Ser Thr Phe Ile Gly Asp Ala Ser Leu Thr Lys Arg
115 120 125

Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gln Val
130 135 140

Lys Ser Glu Asp Gly Asp Arg Leu Pro Val Thr Leu Arg Gly Pro Lys
145 150 155 160

Thr Pro Thr Pro Ile Thr Tyr Arg Val Pro Met Ala Ser Ala Gln Val
165 170 175

Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly Ile Thr Thr
180 185 190

Val Ile Glu Pro Ile Met Thr Arg Asp His Thr Glu Lys Met Leu Gln
195 200 205

Gly Phe Gly Ala Asn Leu Thr Val Glu Thr Asp Ala Asp Gly Val Arg
210 215 220

Thr Ile Arg Leu Glu Gly Arg Gly Lys Leu Thr Gly Gln Val Ile Asp
225 230 235 240

Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala Leu
245 250 255

Leu Val Pro Gly Ser Asp Val Thr Ile Leu Asn Val Leu Met Asn Pro
260 265 270

Thr Arg Thr Gly Leu Ile Leu Thr Leu Gln Glu Met Gly Ala Asp Ile
275 280 285

Glu Val Ile Asn Pro Arg Leu Ala Gly Gly Glu Asp Val Ala Asp Leu
290 295 300

Arg Val Arg Ser Ser Thr Leu Lys Gly Val Thr Val Pro Glu Asp Arg
305 310 315 320

Ala Pro Ser Met Ile Asp Glu Tyr Pro Ile Leu Ala Val Ala Ala Ala
325 330 335

Phe Ala Glu Gly Ala Thr Val Met Asn Gly Leu Glu Glu Leu Arg Val
340 345 350

Lys Glu Ser Asp Arg Leu Ser Ala Val Ala Asn Gly Leu Lys Leu Asn
355 360 365

Gly Val Asp Cys Asp Glu Gly Glu Thr Ser Leu Val Val Arg Gly Arg
370 375 380

Pro Asp Gly Lys Gly Leu Gly Asn Ala Ser Gly Ala Ala Val Ala Thr
385 390 395 400

His Leu Asp His Arg Ile Ala Met Ser Phe Leu Val Met Gly Leu Val
405 410 415

Ser Glu Asn Pro Val Thr Val Asp Asp Ala Thr Met Ile Ala Thr Ser
420 425 430

Phe Pro Glu Phe Met Asp Leu Met Ala Gly Leu Gly Ala Lys Ile Glu
435 440 445

Leu Ser Asp Thr Lys Ala Ala
450 455

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1673 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 86..1432

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GTAGCCACAC ATAATTACTA TAGCTAGGAA GCCCGCTATC TCTCAATCCC GCGTGATCGC 60

GCCAAAATGT GACTGTGAAA AATCC ATG TCC CAT TCT GCA TCC CCG AAA CCA 112
Met Ser His Ser Ala Ser Pro Lys Pro
1 5

GCA ACC GCC CGC CGC TCG GAG GCA CTC ACG GGC GAA ATC CGC ATT CCG 160
Ala Thr Ala Arg Arg Ser Glu Ala Leu Thr Gly Glu Ile Arg Ile Pro
10 15 20 25

GGC GAC AAG TCC ATC TCG CAT CGC TCC TTC ATG TTT GGC GGT CTC GCA 208
Gly Asp Lys Ser Ile Ser His Arg Ser Phe Met Phe Gly Gly Leu Ala
30 35 40

TCG GGC GAA ACC CGC ATC ACC GGC CTT CTG GAA GGC GAG GAC GTC ATC 256
Ser Gly Glu Thr Arg Ile Thr Gly Leu Leu Glu Gly Glu Asp Val Ile
45 50 55

AAT ACA GGC CGC GCC ATG CAG GCC ATG GGC GCG AAA ATC CGT AAA GAG 304
Asn Thr Gly Arg Ala Met Gln Ala Met Gly Ala Lys Ile Arg Lys Glu
60 65 70

GGC GAT GTC TGG ATC ATC AAC GGC GTC GGC AAT GGC TGC CTG TTG CAG 352
Gly Asp Val Trp Ile Ile Asn Gly Val Gly Asn Gly Cys Leu Leu Gln
75 80 85

CCC GAA GCT GCG CTC GAT TTC GGC AAT GCC GGA ACC GGC GCG CGC CTC 400
Pro Glu Ala Ala Leu Asp Phe Gly Asn Ala Gly Thr Gly Ala Arg Leu
90 95 100 105

ACC ATG GGC CTT GTC GGC ACC TAT GAC ATG AAG ACC TCC TTT ATC GGC 448
Thr Met Gly Leu Val Gly Thr Tyr Asp Met Lys Thr Ser Phe Ile Gly
110 115 120

GAC GCC TCG CTG TCG AAG CGC CCG ATG GGC CGC GTG CTG AAC CCG TTG 496
Asp Ala Ser Leu Ser Lys Arg Pro Met Gly Arg Val Leu Asn Pro Leu
125 130 135

CGC GAA ATG GGC GTT CAG GTG GAA GCA GCC GAT GGC GAC CGC ATG CCG 544
Arg Glu Met Gly Val Gln Val Glu Ala Ala Asp Gly Asp Arg Met Pro
140 145 150

CTG ACG CTG ATC GGC ceo AAG ACC GCC AAT CCG ATC ACC TAT CGC GTG 592
Leu Thr Leu Ile Gly Pro Lys Thr Ala Asn Pro Ile Thr Tyr Arg Val
155 160 165

CCG ATG GCC TCC GCG CAG GTA AAA TCC GCC GTG CTG CTC GCC GGT CTC 640
Pro Met Ala Ser Ala Gln Val Lys Ser Ala Val Leu Leu Ala Gly Leu
170 175 180 185

AAC ACG CCG GGC GTC ACC ACC GTC ATC GAG CCG GTC ATG ACC CGC GAC 688
Asn Thr Pro Gly Val Thr Thr Val Ile Glu Pro Val Met Thr Arg Asp
190 195 200

CAC ACC GAA AAG ATG CTG CAG GGC TTT GGC GCC GAC CTC ACG GTC GAG 736
His Thr Glu Lys Met Leu Gln Gly Phe Gly Ala Asp Leu Thr Val Glu
205 210 215

ACC GAC AAG GAT GGC GTG CGC CAT ATC CGC ATC ACC GGC CAG GGC AAG 784
Thr Asp Lys Asp Gly Val Arg His Ile Arg Ile Thr Gly Gln Gly Lys
220 225 230

CTT GTC GGC CAG ACC ATC GAC GTG CCG GGC GAT CCG TCA TCG ACC GCC 832
Leu Val Gly Gln Thr Ile Asp Val Pro Gly Asp Pro Ser Ser Thr Ala
235 240 245

TTC CCG CTC GTT GCC GCC CTT CTG GTG GAA GGT TCC GAC GTC ACC ATC 880
Phe Pro Leu Val Ala Ala Leu Leu Val Glu Gly Ser Asp Val Thr Ile
250 255 260 265

CGC AAC GTG CTG ATG AAC CCG ACC CGT ACC GGC CTC ATC CTC ACC TTG 928
Arg Asn Val Leu Met Asn Pro Thr Arg Thr Gly Leu Ile Leu Thr Leu
270 275 280

CAG GAA ATG GGC GCC GAT ATC GAA GTG CTC AAT GCC CGT CTT GCA GGC 976
Gln Glu Met Gly Ala Asp Ile Glu Val Leu Asn Ala Arg Leu Ala Gly
285 290 295

GGC GAA GAC GTC GCC GAT CTG CGC GTC AGG GCT TCG AAG CTC AAG GGC 1024
Gly Glu Asp Val Ala Asp Leu Arg Val Arg Ala Ser Lys Leu Lys Gly
300 305 310

GTC GTC GTT CCG CCG GAA CGT GCG CCG TCG ATG ATC GAC GAA TAT CCG 1072
Val Val Val Pro Pro Glu Arg Ala Pro Ser Met Ile Asp Glu Tyr Pro
315 320 325

GTC CTG GCG ATT GCC GCC TCC TTC GCG GAA GGC GAA ACC GTG ATG GAC 1120
Val Leu Ala Ile Ala Ala Ser Phe Ala Glu Gly Glu Thr Val Met Asp
330 335 340 345

GGG CTC GAC GAA CTG CGC GTC AAG GAA TCG GAT CGT CTG GCA GCG GTC 1168
Gly Leu Asp Glu Leu Arg Val Lys Glu Ser Asp Arg Leu Ala Ala Val
350 355 360

GCA CGC GGC CTT GAA GCC AAC GGC GTC GAT TGC ACC GAA GGC GAG ATG 1216
Ala Arg Gly Leu Glu Ala Asn Gly Val Asp Cys Thr Glu Gly Glu Met
365 370 375

380 385

ACC GTT GCA ACC CAT CTC GAT CAT CGT ATC GCG ATG AGC TTC CTC GTG 1312
Thr Val Ala Thr His Leu Asp His Arg Ile Ala Met Ser Phe Leu Val
395 400 405

ATG GGC CTT GCG GCG GAA AAG CCG GTG ACG GTT GAC GAC AGT AAC ATG 1360
Met Gly Leu Ala Ala Glu Lys Pro Val Thr Val Asp Asp Ser Asn Met
410 415 420 425

r-nn aec TCC TTC CCC GAA TTC ATG GAC ATG ATG CCG GGA TTG GGC 1408
Ile Ala Thr Ser Phe Pro Glu Phe Met Asp Met Met Pro Gly Leu Gly
430 435 440

GCA AAG ATC GAG TTG AGC ATA CTC TAGTCACTCG ACAGCGAAAA TATTATTTGC 1462
Ala Lys Ile Glu Leu Ser Ile Leu
445

GAGATTGGGC ATTATTACCG GTTGGTCTCA GCGGGGGTTT AATGTCCAAT CTTCCATACG 1522
TAACAGCATC AGGAAATATC AAAAAAGCTT TAGAAGGAAT TGCTAGAGCA GCGACGCCGC 1582
CTAAGCTTTC TCAAGACTTC GTTAAAACTG TACTGAAATC COGGGGGGTC CGGGGATCAA 1642
ATGACTTCAT TTCTGAGAAA TTGGCCTCGC A 1673

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 449 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Ser His Ser Ala Ser Pro Lys Pro Ala Thr Ala Arg Arg Ser Glu
1 5 10 15

Ala Leu Thr Gly Glu Ile Arg Ile Pro Gly Asp Lys Ser Ile Ser His
20 25 30

Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg Ile Thr
35 40 45

Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Arg Ala Met Gln
50 55 60

Ala Met Gly Ala Lys Ile Arg Lys Glu Gly Asp Val Trp Ile Ile Asn
65 70 75 80

Gly Val Gly Asn Gly Cys Leu Leu Gln Pro Glu Ala Ala Leu Asp Phe
85 90 95

Gly Asn Ala Gly Thr Gly Ala Arg Leu Thr Met Gly Leu Val Gly Thr
100 105 110

Tyr Asp Met Lys Thr Ser Phe Ile Gly Asp Ala Ser Leu Ser Lys Arg
115 120 125

Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gln Val
130 135 140

Glu Ala Ala Asp Gly Asp Arg Met Pro Leu Thr Leu Ile Gly Pro Lys
145 150 155 160

Thr Ala Asn Pro Ile Thr Tyr Arg Val Pro Met Ala Ser Ala Gln Val
165 170 175

Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly Val Thr Thr
180 185 190

Val Ile Glu Pro Val Met Thr Arg Asp His Thr Glu Lys Met Leu Gln
195 200 205

Gly Phe Gly Ala Asp Leu Thr Val Glu Thr Asp Lys Asp Gly Val Arg
210 215 220

His Ile Arg Ile Thr Gly Gln Gly Lys Leu Val Gly Gln Thr Ile Asp
225 230 235 240

Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala Leu
245 250 255

Leu Val Glu Gly Ser Asp Val Thr Ile Arg Asn Val Leu Met Asn Pro
260 265 270

Thr Arg Thr Gly Leu Ile Leu Thr Leu Gln Glu Met Gly Ala Asp Ile
275 280 285

Glu Val Leu Asn Ala Arg Leu Ala Gly Gly Glu Asp Val Ala Asp Leu
290 295 300

Arg Val Arg Ala Ser Lys Leu Lys Gly Val Val Val Pro Pro Glu Arg
305 310 315 320

Ala Pro Ser Met Ile Asp Glu Tyr Pro Val Leu Ala Ile Ala Ala Ser
325 330 335

Phe Ala Glu Gly Glu Thr Val Met Asp Gly Leu Asp Glu Leu Arg Val
340 345 350

Lys Glu Ser Asp Arg Leu Ala Ala Val Ala Arg Gly Leu Glu Ala Asn
355 360 365

Gly Val Asp Cys Thr Glu Gly Glu Met Ser Leu Thr Val Arg Gly Arg
370 375 380

Pro Asp Gly Lys Gly Leu Gly Gly Gly Thr Val Ala Thr His Leu Asp
385 390 395 400

His Arg Ile Ala Met Ser Phe Leu Val Met Gly Leu Ala Ala Glu Lys
405 410 415

Pro Val Thr Val Asp Asp Ser Asn Met Ile Ala Thr Ser Phe Pro Glu
420 425 430

Phe Met Asp Met Met Pro Gly Leu Gly Ala Lys Ile Glu Leu Ser Ile
435 440 445

Leu



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1500 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 34..1380

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
OTOATOOC^C C««TOTO» CTOTOA»» ^ IS IS S^o 



54 

Ser Pro 

1 5 



^re CCC CGC CGC TCC GAG GCA CTC ACG GGC GAA ATC CGC 

^ ITo IT. Tla Sg ser Glu Ala .eu Thr Gly Glu Xle Arg 
10 

S: »e IS "I S IS IS JTe Sy ry 

sstsiss^o-?ss^":isi!^s!^„i?;ryr„rp 

40 

^^r^ n^r GCC ATG GGC GCG AAA ATC CGT 

?S JS cfy S^ SS S^n ^.S Se. O.y .y. XXe 

60 *5 

- z -I ^ 'z z s= I J 
IS s: ir.f. r. ^p r.: «J ^« IS ?s ^y 

90 ^5 



CGC ,102 



246 



342
CGC CTC ACC ATG GGC CTT GTC GGC ACC TAT GAC ATG AAG ACC TCC TTT 390
Arg Leu Thr Met Gly Leu Val Gly Thr Tyr Asp Met Lys Thr Ser Phe
105 110 115

ATC GGC GAC GCC TCG CTG TCG AAG CGC CCG ATG GGC CGC GTG CTG AAC 438
Ile Gly Asp Ala Ser Leu Ser Lys Arg Pro Met Gly Arg Val Leu Asn
120 125 130 135

CCG TTG CGC GAA ATG GGC GTT CAG GTG GAA GCA GCC GAT GGC GAC CGC 486
Pro Leu Arg Glu Met Gly Val Gln Val Glu Ala Ala Asp Gly Asp Arg
140 145 150

ATG CCG CTG ACG CTG ATC GGC CCG AAG ACC GCC AAT CCG ATC ACC TAT 534
Met Pro Leu Thr Leu Ile Gly Pro Lys Thr Ala Asn Pro Ile Thr Tyr
155 160 165

CGC GTG CCG ATG GCC TCC GCG CAG GTA AAA TCC GCC GTO CTG CTC GCC 582
Arg Val Pro Met Ala Ser Ala Gln Val Lys Ser Ala Val Leu Leu Ala
170 175 180

GGT CTC AAC ACG CCG GGC GTC ACC ACC GTC ATC GAG CCG GTC ATG ACC 630
Gly Leu Asn Thr Pro Gly Val Thr Thr Val Ile Glu Pro Val Met Thr
185 190 195

CGC GAC CAC ACC GAA AAG ATG CTG CAG GGC TTT GGC GCC GAC CTC ACG 678
Arg Asp His Thr Glu Lys Met Leu Gln Gly Phe Gly Ala Asp Leu Thr
200 205 210 215

GTC GAG ACC GAC AAG GAT GGC GTC CGC CAT ATC CGC ATC ACC GGC CAG 726
Val Glu Thr Asp Lys Asp Gly Val Arg His Ile Arg Ile Thr Gly Gln
220 225 230

GGC AAG CTT GTC GGC CAG ACC ATC GAC GTC CCG GGC GAT CCG TCA TCC 774
Gly Lys Leu Val Gly Gln Thr Ile Asp Val Pro Gly Asp Pro Ser Ser
235 240 245

ACC GCC TTC CCG CTC GTT GCC GCC CTT CTG GTG GAA GGT TCC GAC GTC 822
Thr Ala Phe Pro Leu Val Ala Ala Leu Leu Val Glu Gly Ser Asp Val
250 255 260

ACC ATC CGC AAC GTG CTG ATG AAC CCG ACC CGT ACC GGC CTC ATC CTC 870
Thr Ile Arg Asn Val Leu Met Asn Pro Thr Arg Thr Gly Leu Ile Leu
265 270 275

ACC TTG CAG GAA ATG GGC GCC GAT ATC GAA GTG CTC AAT GCC CGT CTT 918
Thr Leu Gln Glu Met Gly Ala Asp Ile Glu Val Leu Asn Ala Arg Leu
280 285 290 295

GCA GGC GGC GAA GAC GTC GCC GAT CTG CGC GTC AGG GCT TCG AAG CTC 966
Ala Gly Gly Glu Asp Val Ala Asp Leu Arg Val Arg Ala Ser Lys Leu
300 305 310

AAG GGC GTC GTC GTT CCG CCG GAA CGT GCG CCG TCG ATG ATC GAC GAA 1014
Lys Gly Val Val Val Pro Pro Glu Arg Ala Pro Ser Met Ile Asp Glu
315 320 325

TAT CCG GTC CTG GCXS ATT GCC GCC TCC TTC GCG GAA GGC GAA ACC GTC 1062
Tyr Pro Val Leu Ala Ile Ala Ala Ser Phe Ala Glu Gly Glu Thr Val
330 335 340

ATG GAC GGG CTC GAC GAA CTG CGC GTC AAG GAA TCG GAT CGT CTG GCA 1110
Met Asp Gly Leu Asp Glu Leu Arg Val Lys Glu Ser Asp Arg Leu Ala
345 350 355

GCG GTC GCA CGC GGC CTT GAA GCC AAC GGC GTC GAT TGC ACC GAA GGC 1158
Ala Val Ala Arg Gly Leu Glu Ala Asn Gly Val Asp Cys Thr Glu Gly
360 365 370 375

GAG ATG TCG CTG ACG GTT CGC GGC CGC CCC GAC GGC AAG GGA CTG GGC 1206
Glu Met Ser Leu Thr Val Arg Gly Arg Pro Asp Gly Lys Gly Leu Gly
380 385 390

GGC GGC ACG GTT GCA ACC CAT CTC GAT CAT CGT ATC GCG ATG AGC TTC 1254
Gly Gly Thr Val Ala Thr His Leu Asp His Arg Ile Ala Met Ser Phe
395 400 405

CTC GTG ATG GGC CTT GCG GCG GAA AAG CCG GTG ACG GTT GAC GAC AGT 1302
Leu Val Met Gly Leu Ala Ala Glu Lys Pro Val Thr Val Asp Asp Ser
410 415 420

AAC ATG ATC GCC ACG TCC TTC CCC GAA TTC ATG GAC ATG ATG CCG GGA 1350
Asn Met Ile Ala Thr Ser Phe Pro Glu Phe Met Asp Met Met Pro Gly
425 430 435

TTG GGC GCA AAG ATC GAG TTG AGC ATA CTC TAGTCACTCG ACAGCGAAAA 1400
Leu Gly Ala Lys Ile Glu Leu Ser Ile Leu
440 445

TATTATTTGC GAGATTGGGC ATTATTACCG GTTGGTCTCA GCGGGGGTTT AATGTCCAAT 1460
CTTCCATACG TAACAGCATC AGGAAATATC AAAAAAGCTT 1500



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 449 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Ser His Ser Ala Ser Pro Lys Pro Ala Thr Ala Arg Arg Ser Glu
1 5 10 15

Ala Leu Thr Gly Glu Ile Arg Ile Pro Gly Asp Lys Ser Ile Ser His
20 25 30

Arg Ser Phe Met Phe Gly Gly Leu Ala Ser Gly Glu Thr Arg Ile Thr
35 40 45

Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Arg Ala Met Gln
50 55 60

Ala Met Gly Ala Lys Ile Arg Lys Glu Gly Asp Val Trp Ile Ile Asn
65 70 75 80

Gly Val Gly Asn Gly Cys Leu Leu Gln Pro Glu Ala Ala Leu Asp Phe
85 90 95

Gly Asn Ala Gly Thr Gly Ala Arg Leu Thr Met Gly Leu Val Gly Thr
100 105 110

Tyr Asp Met Lys Thr Ser Phe Ile Gly Asp Ala Ser Leu Ser Lys Arg
115 120 125

Pro Met Gly Arg Val Leu Asn Pro Leu Arg Glu Met Gly Val Gln Val
130 135 140

Glu Ala Ala Asp Gly Asp Arg Met Pro Leu Thr Leu Ile Gly Pro Lys
145 150 155 160

Thr Ala Asn Pro Ile Thr Tyr Arg Val Pro Met Ala Ser Ala Gln Val
165 170 175

Lys Ser Ala Val Leu Leu Ala Gly Leu Asn Thr Pro Gly Val Thr Thr
180 185 190

Val Ile Glu Pro Val Met Thr Arg Asp His Thr Glu Lys Met Leu Gln
195 200 205

Gly Phe Gly Ala Asp Leu Thr Val Glu Thr Asp Lys Asp Gly Val Arg
210 215 220

His Ile Arg Ile Thr Gly Gln Gly Lys Leu Val Gly Gln Thr Ile Asp
225 230 235 240

Val Pro Gly Asp Pro Ser Ser Thr Ala Phe Pro Leu Val Ala Ala Leu
245 250 255

Leu Val Glu Gly Ser Asp Val Thr Ile Arg Asn Val Leu Met Asn Pro
260 265 270

Thr Arg Thr Gly Leu Ile Leu Thr Leu Gln Glu Met Gly Ala Asp Ile
275 280 285

Glu Val Leu Asn Ala Arg Leu Ala Gly Gly Glu Asp Val Ala Asp Leu
290 295 300

Arg Val Arg Ala Ser Lys Leu Lys Gly Val Val Val Pro Pro Glu Arg
305 310 315 320

Ala Pro Ser Met Ile Asp Glu Tyr Pro Val Leu Ala Ile Ala Ala Ser
325 330 335

Phe Ala Glu Gly Glu Thr Val Met Asp Gly Leu Asp Glu Leu Arg Val
340 345 350

Lys Glu Ser Asp Arg Leu Ala Ala Val Ala Arg Gly Leu Glu Ala Asn
355 360 365

Gly Val Asp Cys Thr Glu Gly Glu Met Ser Leu Thr Val Arg Gly Arg
370 375 380

Pro Asp Gly Lys Gly Leu Gly Gly Gly Thr Val Ala Thr His Leu Asp
385 390 395 400

His Arg Ile Ala Met Ser Phe Leu Val Met Gly Leu Ala Ala Glu Lys
405 410 415

Pro Val Thr Val Asp Asp Ser Asn Met Ile Ala Thr Ser Phe Pro Glu
420 425 430

Phe Met Asp Met Met Pro Gly Leu Gly Ala Lys Ile Glu Leu Ser Ile
435 440 445

Leu



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 423 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Ser Leu Thr Leu Gln Pro Ile Ala Arg Val Asp Gly Thr Ile Asn Leu
1 5 10 15

Pro Gly Ser Lys Thr Val Ser Asn Arg Ala Leu Leu Leu Ala Ala Leu
20 25 30

Ala His Gly Lys Thr Val Leu Thr Asn Leu Leu Asp Ser Asp Asp Val
35 40 45

Arg His Met Leu Asn Ala Leu Thr Ala Leu Gly Val Ser Tyr Thr Leu
50 55 60

Ser Ala Asp Arg Thr Arg Cys Glu Ile Ile Gly Asn Gly Gly Pro Leu
65 70 75 80

His Ala Glu Gly Ala Leu Glu Leu Phe Leu Gly Asn Ala Gly Thr Ala
85 90 95

Met Arg Pro Leu Ala Ala Ala Leu Cys Leu Gly Ser Asn Asp Ile Val
100 105 110

Leu Thr Gly Glu Pro Arg Met Lys Glu Arg Pro Ile Gly His Leu Val
115 120 125

Asp Ala Leu Arg Leu Gly Gly Ala Lys Ile Thr Tyr Leu Glu Gln Glu
130 135 140

Asn Tyr Pro Pro Leu Arg Leu Gln Gly Gly Phe Thr Gly Gly Asn Val
145 150 155 160

Asp Val Asp Gly Ser Val Ser Ser Gln Phe Leu Thr Ala Leu Leu Met
165 170 175

Thr Ala Pro Leu Ala Pro Glu Asp Thr Val Ile Arg Ile Lys Gly Asp
180 185 190

Leu Val Ser Lys Pro Tyr Ile Asp Ile Thr Leu Asn Leu Met Lys Thr
195 200 205

Phe Gly Val Glu Ile Glu Asn Gln His Tyr Gln Gln Phe Val Val Lys
210 215 220

Gly Gly Gln Ser Tyr Gln Ser Pro Gly Thr Tyr Leu Val Glu Gly Asp
225 230 235 240

Ala Ser Ser Ala Ser Tyr Phe Leu Ala Ala Ala Ala Ile Lys Gly Gly
245 250 255

Thr Val Lys Val Thr Gly Ile Gly Arg Asn Ser Met Gln Gly Asp Ile
260 265 270

Arg Phe Ala Asp Val Leu Glu Lys Met Gly Ala Thr Ile Cys Trp Gly
275 280 285

Asp Asp Tyr Ile Ser Cys Thr Arg Gly Glu Leu Asn Ala Ile Asp Met
290 295 300

Asp Met Asn His Ile Pro Asp Ala Ala Met Thr Ile Ala Thr Ala Ala
305 310 315 320

Leu Phe Ala Lys Gly Thr Thr Arg Leu Arg Asn Ile Tyr Asn Trp Arg
325 330 335

Val Lys Glu Thr Asp Arg Leu Phe Ala Met Ala Thr Glu Leu Arg Lys
340 345 350

Val Gly Ala Glu Val Glu Glu Gly His Asp Tyr Ile Arg Ile Thr Pro
355 360 365

Pro Glu Lys Leu Asn Phe Ala Glu Ile Ala Thr Tyr Asn Asp His Arg
370 375 380

Met Ala Met Cys Phe Ser Leu Val Ala Leu Ser Asp Thr Pro Val Thr
385 390 395 400

Ile Leu Asp Pro Lys Cys Thr Ala Lys Thr Phe Pro Asp Tyr Phe Glu
405 410 415

Gln Leu Ala Arg Ile Ser Gln
420

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1377 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



CCATGGCTCA CGGTGCAAGC AGCCGTCCAG CAACTGCTCG TAAGTCCTCT GGTCTTTCTG 60
GAACCGTCCG TATTCCAGGT GACAAGTCTA TCTCCCACAG GTCCTTCATG TTTGGAGGTC 120
TCGCTAGCGG TGAAACTCGT ATCACCGGTC TTTTGGAAGG TGAAGATGTT ATCAACACTG 180
GTAAGGCTAT GCAAGCTATG GGTGCCAGAA TCCGTAAGGA AGGTGATACT TGGATCATTG 240
ATGGTGTTGG TAACGGTGGA CTCCTTGCTC CTGAGGCTCC TCTCGATTTC GGTAACGCTG 300
CAACTGGTTG CCGTTTGACT ATGGGTCTTG TTGGTGTTTA CGATTTCGAT AGCACTTTCA 360
TTGGTGACGC TTCTCTCACT AAGCGTCCAA TGGGTCGTGT GTTGAACCCA CTTCGCGAAA 420
[...] [...] GAAGACGGTG ATCGTCTTCC AGTTACCTTG CGTGGACCAA 480
AGACTCCAAC GCCAATCACC TACAGGGTAC CTATGGCTTC CGCTCAAGTG AAGTCCGCTG 540
TTCTGCTTGC TGGTCTCAAC ACCCCAGGTA TCACCACTGT TATCGAGCCA ATCATGACTC 600
GTGACCACAC TGAAAAGATG CTTCAAGGTT TTGGTGCTAA CCTTACCGTT GAGACTGATG 660
CTGACGGTGT GCGTACCATC CGTCTTGAAG GTCGTGGTAA GCTCACCGGT CAAGTGATTG 720
ATGTTCCAGG TGATCCATCC TCTACTGCTT TCCCATTGGT TGCTGCCTTG CTTGTTCCAG 780
GTTCCGACGT CACCATCCTT AACGTTTTGA TGAACCCAAC CCGTACTGGT CTCATCTTGA 840
CTCTGCAGGA AATGGGTGCC GACATCGAAG TGATCAACCC ACGTCTTGCT GGTGGAGAAG 900
ACGTGGCTGA CTTGCGTGTT CGTTCTTCTA CTTTGAAGGG TGTTACTGTT CCAGAAGACC 960
GTGCTCCTTC TATGATCGAC GAGTATCCAA TTCTCGCTGT TGCAGCTGCA TTCGCTGAAG 1020
GTGCTACCGT TATGAACGGT TTGGAAGAAC TCCGTGTTAA GGAAAGCGAC CGTCTTTCTG 1080
CTGTCGCAAA CGGTCTCAAG CTCAACGGTG TTGATTGCGA TGAAGGTGAG ACTTCTCTCG 1140
TCGTGCGTGG TCGTCCTGAC GGTAAGGGTC TCGGTAACGC TTCTGGAGCA GCTGTCGCTA 1200
CCCACCTCGA TCACCGTATC GCTATGAGCT TCCTCGTTAT GGGTCTCGTT TCTGAAAACC 1260
CTGTTACTGT TGATGATGCT ACTATGATCG CTACTAGCTT CCCAGAGTTC ATGGATTTGA 1320
TGGCTGGTCT TGGAGCTAAG ATCGAACTCT CCGACACTAA GGCTGCTTGA TGAGCTC 1377
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 318 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS
(B) LOCATION: 87..317

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
AGATCTATCG ATAAGCTTGA TGTAATTGGA GGAAGATCAA AATTTTCAAT CCCCATTCTT 60

CGATTGCTTC AATTGAAGTT TCTCCG ATG GCG CAA GTT AGC AGA ATC TGC AAT 113
Met Ala Gln Val Ser Arg Ile Cys Asn
1 5

GGT GTG CAG AAC CCA TCT CTT ATC TCC AAT CTC TCG AAA TCC AGT CAA 161
Gly Val Gln Asn Pro Ser Leu Ile Ser Asn Leu Ser Lys Ser Ser Gln
10 15 20 25

CGC AAA TCT CCC TTA TCG GTT TCT CTG AAG ACG CAG CAG CAT CCA CGA 209
Arg Lys Ser Pro Leu Ser Val Ser Leu Lys Thr Gln Gln His Pro Arg
30 35 40

GCT TAT CCG ATT TCG TCG TCG TGG GGA TTG AAG AAG AGT GGG ATG ACG 257
Ala Tyr Pro Ile Ser Ser Ser Trp Gly Leu Lys Lys Ser Gly Met Thr
45 50 55

TTA ATT GGC TCT GAG CTT CGT CCT CTT AAG GTC ATG TCT TCT GTT TCC 305
Leu Ile Gly Ser Glu Leu Arg Pro Leu Lys Val Met Ser Ser Val Ser
60 65 70

ACG GCG TGC ATG C 318
Thr Ala Cys Met
75



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ala Gln Val Ser Arg Ile Cys Asn Gly Val Gln Asn Pro Ser Leu
1 5 10 15

Ile Ser Asn Leu Ser Lys Ser Ser Gln Arg Lys Ser Pro Leu Ser Val
20 25 30

Ser Leu Lys Thr Gln Gln His Pro Arg Ala Tyr Pro Ile Ser Ser Ser
35 40 45

Trp Gly Leu Lys Lys Ser Gly Met Thr Leu Ile Gly Ser Glu Leu Arg
50 55 60

Pro Leu Lys Val Met Ser Ser Val Ser Thr Ala Cys Met
65 70 75
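
The CDS locations and protein lengths given in this listing can be cross-checked with simple arithmetic. The following is an illustrative sketch only, not part of the listing: it assumes that each LOCATION is 1-based and inclusive and that the range excludes the stop codon, which is consistent with the figures stated here (for example, 87..317 for SEQ ID NO: 10 and 77 amino acids for SEQ ID NO: 11). The function name `cds_codon_count` is hypothetical.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a CDS given 1-based inclusive coordinates.

    Assumption (not stated in the listing): the range excludes the stop codon.
    """
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a whole number of codons")
    return length // 3


# (CDS start, CDS end) and stated protein length, as given in the listing.
RECORDS = {
    "SEQ ID NO: 4 -> 5": ((86, 1432), 449),
    "SEQ ID NO: 6 -> 7": ((34, 1380), 449),
    "SEQ ID NO: 10 -> 11": ((87, 317), 77),
    "SEQ ID NO: 12 -> 13": ((87, 401), 105),
    "SEQ ID NO: 14 -> 15": ((14, 232), 73),
    "SEQ ID NO: 16 -> 17": ((49, 351), 101),
}

for name, ((start, end), stated) in RECORDS.items():
    # Print the computed codon count next to the stated protein length.
    print(name, cds_codon_count(start, end), stated)
```

Under these assumptions, the computed codon count matches the stated amino acid length for every nucleotide/protein pair above.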

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 87..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

AGATCTATCG ATAAGCTTGA TGTAATTGGA GGAAGATCAA AATTTTCAAT CCCCATTCTT 60

CGATTGCTTC AATTGAAGTT TCTCCG ATG GCG CAA GTT AGC AGA ATC TGC AAT 113
Met Ala Gln Val Ser Arg Ile Cys Asn
1 5

GGT GTG CAG AAC CCA TCT CTT ATC TCC AAT CTC TCG AAA TCC AGT CAA 161
Gly Val Gln Asn Pro Ser Leu Ile Ser Asn Leu Ser Lys Ser Ser Gln
10 15 20 25

CGC AAA TCT CCC TTA TCG GTT TCT CTG AAG ACG CAG CAG CAT CCA CGA 209
Arg Lys Ser Pro Leu Ser Val Ser Leu Lys Thr Gln Gln His Pro Arg
30 35 40

GCT TAT CCG ATT TCG TCG TCG TGG GGA TTG AAG AAG AGT GGG ATG ACG 257
Ala Tyr Pro Ile Ser Ser Ser Trp Gly Leu Lys Lys Ser Gly Met Thr
45 50 55

TTA ATT GGC TCT GAG CTT CGT CCT CTT AAG GTC ATG TCT TCT GTT TCC 305
Leu Ile Gly Ser Glu Leu Arg Pro Leu Lys Val Met Ser Ser Val Ser
60 65 70

ACG GCG GAG AAA GCG TCG GAG ATT GTA CTT CAA CCC ATT AGA GAA ATC 353
Thr Ala Glu Lys Ala Ser Glu Ile Val Leu Gln Pro Ile Arg Glu Ile
75 80 85

TCC GGT CTT ATT AAG TTG CCT GGC TCC AAG TCT CTA TCA AAT AGA ATT 401
Ser Gly Leu Ile Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn Arg Ile
90 95 100 105

402

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 105 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Ala Gln Val Ser Arg Ile Cys Asn Gly Val Gln Asn Pro Ser Leu
1 5 10 15

Ile Ser Asn Leu Ser Lys Ser Ser Gln Arg Lys Ser Pro Leu Ser Val
20 25 30

Ser Leu Lys Thr Gln Gln His Pro Arg Ala Tyr Pro Ile Ser Ser Ser
35 40 45

Trp Gly Leu Lys Lys Ser Gly Met Thr Leu Ile Gly Ser Glu Leu Arg
50 55 60

Pro Leu Lys Val Met Ser Ser Val Ser Thr Ala Glu Lys Ala Ser Glu
65 70 75 80

Ile Val Leu Gln Pro Ile Arg Glu Ile Ser Gly Leu Ile Lys Leu Pro
85 90 95

Gly Ser Lys Ser Leu Ser Asn Arg Ile
100 105

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 14..232

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

AGATCTTTCA AGA ATG GCA CAA ATT AAC AAC ATG GCT CAA GGG ATA CAA 49
Met Ala Gln Ile Asn Asn Met Ala Gln Gly Ile Gln
1 5 10

ACC CTT AAT CCC AAT TCC AAT TTC CAT AAA CCC CAA GTT CCT AAA TCT 97
Thr Leu Asn Pro Asn Ser Asn Phe His Lys Pro Gln Val Pro Lys Ser
15 20 25

TCA AGT TTT CTT GTT TTT GGA TCT AAA AAA CTG AAA AAT TCA GCA AAT 145
Ser Ser Phe Leu Val Phe Gly Ser Lys Lys Leu Lys Asn Ser Ala Asn
30 35 40

TCT ATG TTG GTT TTG AAA AAA GAT TCA ATT TTT ATG CAA AAG TTT TGT 193
Ser Met Leu Val Leu Lys Lys Asp Ser Ile Phe Met Gln Lys Phe Cys
45 50 55 60

TCC TTT AGG ATT TCA GCA TCA GTG GCT ACA GCC TGC ATG C 233
Ser Phe Arg Ile Ser Ala Ser Val Ala Thr Ala Cys Met
65 70



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Ala Gln Ile Asn Asn Met Ala Gln Gly Ile Gln Thr Leu Asn Pro
1 5 10 15

Asn Ser Asn Phe His Lys Pro Gln Val Pro Lys Ser Ser Ser Phe Leu
20 25 30

Val Phe Gly Ser Lys Lys Leu Lys Asn Ser Ala Asn Ser Met Leu Val
35 40 45

Leu Lys Lys Asp Ser Ile Phe Met Gln Lys Phe Cys Ser Phe Arg Ile
50 55 60

Ser Ala Ser Val Ala Thr Ala Cys Met
65 70
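
The paired nucleotide and amino acid lines in this listing follow the standard genetic code, so any codon line can be checked against the residue line printed beneath it. The sketch below is illustrative only and assumes the standard code; the names `CODON_TABLE` and `translate` are hypothetical, and the output uses one-letter amino acid codes rather than the three-letter codes of the listing.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second, then third base over T, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string (spaces allowed) into one-letter residues."""
    seq = dna.upper().replace(" ", "")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First twelve codons of the CDS of SEQ ID NO: 14, as printed in the listing.
print(translate("ATG GCA CAA ATT AAC AAC ATG GCT CAA GGG ATA CAA"))
```

The printed residues can be compared with the first line of SEQ ID NO: 15 (Met Ala Gln Ile Asn Asn Met Ala Gln Gly Ile Gln).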



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 49..351

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AGATCTGCTA GAAATAATTT TGTTTAACTT TAAGAAGGAG ATATATCC ATG GCA CAA 57
Met Ala Gln
1

ATT AAC AAC ATG GCT CAA GGG ATA CAA ACC CTT AAT CCC AAT TCC AAT 105
Ile Asn Asn Met Ala Gln Gly Ile Gln Thr Leu Asn Pro Asn Ser Asn
5 10 15

TTC CAT AAA CCC CAA GTT CCT AAA TCT TCA AGT TTT CTT GTT TTT GGA 153
Phe His Lys Pro Gln Val Pro Lys Ser Ser Ser Phe Leu Val Phe Gly
20 25 30 35

TCT AAA AAA CTG AAA AAT TCA GCA AAT TCT ATG TTG GTT TTG AAA AAA 201
Ser Lys Lys Leu Lys Asn Ser Ala Asn Ser Met Leu Val Leu Lys Lys
40 45 50

GAT TCA ATT TTT ATG CAA AAG TTT TGT TCC TTT AGG ATT TCA GCA TCA 249
Asp Ser Ile Phe Met Gln Lys Phe Cys Ser Phe Arg Ile Ser Ala Ser
55 60 65

GTG GCT ACA GCA CAG AAG CCT TCT GAG ATA GTG TTG CAA CCC ATT AAA 297
Val Ala Thr Ala Gln Lys Pro Ser Glu Ile Val Leu Gln Pro Ile Lys
70 75 80

GAG ATT TCA GGC ACT GTT AAA TTG CCT GGC TCT AAA TCA TTA TCT AAT 345
Glu Ile Ser Gly Thr Val Lys Leu Pro Gly Ser Lys Ser Leu Ser Asn
85 90 95

AGA ATT C 352
Arg Ile
100



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 101 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Met Ala Gln Ile Asn Asn Met Ala Gln Gly Ile Gln Thr Leu Asn Pro
1 5 10 15

Asn Ser Asn Phe His Lys Pro Gln Val Pro Lys Ser Ser Ser Phe Leu
20 25 30

Val Phe Gly Ser Lys Lys Leu Lys Asn Ser Ala Asn Ser Met Leu Val
35 40 45

Leu Lys Lys Asp Ser Ile Phe Met Gln Lys Phe Cys Ser Phe Arg Ile
50 55 60

Ser Ala Ser Val Ala Thr Ala Gln Lys Pro Ser Glu Ile Val Leu Gln
65 70 75 80

Pro Ile Lys Glu Ile Ser Gly Thr Val Lys Leu Pro Gly Ser Lys Ser
85 90 95

Leu Ser Asn Arg Ile
100

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Xaa His Gly Ala Ser Ser Arg Pro Ala Thr Ala Arg Lys Ser Ser Gly 
1 5 10 15

Leu Xaa Gly Thr Val Arg Ile Pro Gly Asp Lys Met
20 25 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
Ala Pro Ser Met Ile Asp Glu Tyr Pro Ile Leu Ala



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Ile Thr Gly Leu Leu Glu Gly Glu Asp Val Ile Asn Thr Gly Lys



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
ATGATHGAYG ARTAYCC 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GARGAYGTNA THAACAC 

(2) INFORMATION FOR SEQ ID NO: 23: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GARGAYGTNA THAATAC 
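
SEQ ID NOS: 21 to 23 are written with IUPAC ambiguity codes (R, Y, H, N), so each entry stands for a pool of oligonucleotides rather than a single sequence. The sketch below is illustrative only and shows how such a degenerate sequence can be expanded into its individual members; the names `IUPAC` and `expand_degenerate` are hypothetical and are not taken from the listing.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list[str]:
    """Return every concrete sequence represented by a degenerate oligonucleotide."""
    pools = [IUPAC[base] for base in oligo.upper().replace(" ", "")]
    return ["".join(combo) for combo in product(*pools)]


members = expand_degenerate("ATGATHGAYG ARTAYCC")  # SEQ ID NO: 21
print(len(members))
print("ATGATCGACGAATATCC" in members)
```

The size of each pool is the product of the number of bases allowed at each position, which is why a single `N` or `H` multiplies the number of distinct oligonucleotides.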

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
CGTGGATAGA TCTAGGAAGA CAACCATGGC TCACGGTC 
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

GGATAGATTA AGGAAGACGC GCATGCTTCA CGGTGCAAGC AGCC 44 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CGCTGCCTGA TGAGCTCCAC AATCCCCATC GATGG 
(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
CGTCGCTCGT CGTGCGTGGC CGCCCTGACG GC 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CGGGCAAGGC CATGCAGGCT ATGGGCGCC 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CGGGCTGCCG CCTGACTATG GGCCTCGTCG G 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Xaa His Ser Ala Ser Pro Lys Pro Ala Thr Ala Arg Arg Ser Glu
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GCGGTBGCSG GYTTSGG

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Pro Gly Asp Lys Ser Ile Ser His Arg Ser Phe Met Phe Gly Gly Leu
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Leu Asp Phe Gly Asn Ala Ala Thr Gly Cys Arg Leu Thr
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
CGGCAATGCC GCCACCGGCG CGCGCC 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GGACGGCTGC TTGCACCGTG AAGCATGCTT AAGCTTGGCG TAATCATGG
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GGAAGACGCC CAGAATTCAC GGTGCAAGCA GCCGG

1. An isolated DNA sequence encoding an EPSPS
enzyme having a Km for phosphoenolpyruvate (PEP) between 1-150
μM and a Ki(glyphosate)/Km(PEP) ratio between 3-500, which DNA
sequence is capable of hybridizing to a DNA probe from a sequence
selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4,
and SEQ ID NO:6.

2. A DNA molecule of claim 1 wherein said Km for
phosphoenolpyruvate is between 2-25 μM.

3. A DNA molecule of claim 1 wherein said Ki/Km
ratio is between 6-250.
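
The first three claims characterize the enzyme by two kinetic windows: a Km for PEP and a Ki(glyphosate)/Km(PEP) ratio. As an illustrative sketch only, a pair of measured constants could be compared against those windows as follows. The Km units are taken as micromolar, the endpoints are treated as inclusive, and the numeric values for the example enzymes are hypothetical; none of these assumptions comes from the claims themselves.

```python
def ki_km_ratio(ki_glyphosate, km_pep):
    """Ratio of Ki(glyphosate) to Km(PEP); both values in the same units."""
    return ki_glyphosate / km_pep

def satisfies(km_pep, ki_glyphosate, km_window, ratio_window):
    """True if Km(PEP) and Ki/Km both fall inside the given windows.

    Endpoints are treated as inclusive, which is an assumption of this sketch.
    """
    ratio = ki_km_ratio(ki_glyphosate, km_pep)
    return (km_window[0] <= km_pep <= km_window[1]
            and ratio_window[0] <= ratio <= ratio_window[1])

# Windows recited in claims 1-3 (Km taken as micromolar).
CLAIM_1 = {"km_window": (1, 150), "ratio_window": (3, 500)}
CLAIM_2 = {"km_window": (2, 25), "ratio_window": (3, 500)}   # narrows Km
CLAIM_3 = {"km_window": (1, 150), "ratio_window": (6, 250)}  # narrows Ki/Km

# Hypothetical enzyme constants, not measurements from this document.
km, ki = 10.0, 2000.0
print(ki_km_ratio(ki, km))           # 200.0
print(satisfies(km, ki, **CLAIM_1))  # True
print(satisfies(km, ki, **CLAIM_2))  # True
print(satisfies(km, ki, **CLAIM_3))  # True
```

A second hypothetical enzyme with Km = 100 and Ki = 400 (ratio 4) would fall inside the broadest windows but outside both narrower ones.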

4. An isolated DNA sequence encoding a protein
which exhibits EPSPS activity wherein said protein is capable of
reacting with antibodies raised against a Class II EPSPS enzyme.

5. The DNA sequence of Claim 4 wherein said
protein is capable of reacting with antibodies raised against a
Class II EPSPS enzyme selected from the group consisting of SEQ
ID NO:3, SEQ ID NO:5, and SEQ ID NO:7.

6. The DNA sequence of Claim 5 wherein said 
antibodies are raised against a Class II EPSPS enzyme of SEQ ID 
NO:3. 



7. A recombinant, double-stranded DNA molecule
comprising in sequence:

a) a promoter which functions in plant cells to cause
the production of an RNA sequence;

b) a structural DNA sequence that causes the
production of an RNA sequence which encodes a
Class II EPSPS enzyme; and

c) a 3' non-translated region which functions in
plant cells to cause the addition of a stretch of
polyadenyl nucleotides to the 3' end of the RNA
sequence

where the promoter is heterologous with respect to the structural
DNA sequence and adapted to cause sufficient expression of the
fusion polypeptide to enhance the glyphosate tolerance of a plant
cell transformed with said DNA molecule.

8. The DNA molecule of Claim 7 in which said
structural DNA sequence encodes a fusion polypeptide comprising
an amino-terminal chloroplast transit peptide and a Class II
EPSPS enzyme.

9. The DNA molecule of Claim 8 wherein said
structural DNA sequence encoding a Class II EPSPS enzyme is
selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4
and SEQ ID NO:6.

10. The DNA molecule of Claim 9 wherein said
sequence is from SEQ ID NO:2.

11. A DNA molecule of Claim 8 in which the 
promoter is a plant DNA virus promoter. 



12. A DNA molecule of Claim 11 in which the
promoter is selected from the group consisting of CaMV35S and
FMV35S promoters.

13. A method of producing genetically transformed
plants which are tolerant toward glyphosate herbicide, comprising
the steps of:

a) inserting into the genome of a plant cell a
recombinant, double-stranded DNA molecule
comprising:

i) a promoter which functions in plant cells to cause
the production of an RNA sequence,

ii) a structural DNA sequence that causes the
production of an RNA sequence which encodes a
fusion polypeptide comprising an amino terminal
chloroplast transit peptide and a Class II EPSPS
enzyme,

iii) a 3' non-translated DNA sequence which
functions in plant cells to cause the addition of a
stretch of polyadenyl nucleotides to the 3' end of
the RNA sequence
where the promoter is heterologous with respect to the structural
DNA sequence and adapted to cause sufficient expression of the
fusion polypeptide to enhance the glyphosate tolerance of a plant
cell transformed with said gene;

b) obtaining a transformed plant cell; and 

c) regenerating from the transformed plant cell a 
genetically transformed plant which has 
increased tolerance to glyphosate herbicide. 



14. The method of Claim 13 wherein said structural
DNA sequence encoding a Class II EPSPS enzyme is selected from
the group consisting of SEQ ID NO:2, SEQ ID NO:4, and SEQ ID
NO:6.

15. The DNA molecule of Claim 14 wherein said 
sequence is that as set forth in SEQ ID NO:2. 

16. A method of Claim 13 in which the promoter is
from a plant DNA virus.

17. A method of Claim 16 in which the promoter is 
selected from the group consisting of CaMV35S and FMV35S 
promoters. 


18. A glyphosate tolerant plant cell comprising a 
DNA molecule of Claims 8, 9 or 12. 

19. A glyphosate tolerant plant cell of Claim 18 in
which the promoter is a plant DNA virus promoter.

20. A glyphosate tolerant plant cell of Claim 19 in
which the promoter is selected from the group consisting of
CaMV35S and FMV35S promoters.

21. A glyphosate tolerant plant cell of Claim 18
selected from the group consisting of corn, wheat, rice, soybean,
cotton, sugarbeet, oilseed rape, canola, flax, sunflower, potato,
tobacco, tomato, alfalfa, poplar, pine, apple and grape.

22. A glyphosate tolerant plant comprising plant cells
of Claim 18.

23. A glyphosate tolerant plant of Claim 22 in which
the promoter is from a DNA plant virus promoter.

24. A glyphosate tolerant plant of Claim 23 in which
the promoter is selected from the group consisting of CaMV35S
and FMV35S promoters.

25. A glyphosate tolerant plant of Claim 22 selected
from the group consisting of corn, wheat, rice, soybean, cotton,
sugarbeet, oilseed rape, canola, flax, sunflower, potato, tobacco, 
tomato, alfalfa, poplar, pine, apple and grape. 

26. A method for selectively controlling weeds in a 
field containing a crop having planted crop seeds or plants 
comprising the steps of: 

a) planting said crop seeds or plants which are 
glyphosate tolerant as a result of a recombinant 
double-stranded DNA molecule being inserted 
into said crop seed or plant, said DNA molecule 
having: 

i) a promoter which functions in plant cells to cause
the production of an RNA sequence,

ii) a structural DNA sequence that causes the
production of an RNA sequence which encodes a
polypeptide which comprises an amino terminal
chloroplast transit peptide and a Class II EPSPS
enzyme,

iii) a 3' non-translated DNA sequence which
functions in plant cells to cause the addition of a
stretch of polyadenyl nucleotides to the 3' end of
the RNA sequence
where the promoter is heterologous with respect to the structural
DNA sequence and adapted to cause sufficient expression of the
fusion polypeptide to enhance the glyphosate tolerance of a plant
cell transformed with said gene; and

b) applying to said crop and weeds in said field a
sufficient amount of glyphosate herbicide to
control said weeds without significantly affecting
said crop.

27. The method of Claim 26 wherein said structural
DNA sequence encoding a Class II EPSPS enzyme is selected from
the sequences as set forth in SEQ ID NO:2, SEQ ID NO:4 or SEQ ID
NO:6.

28. A method of Claim 27 in which said DNA
molecule contains a structural DNA sequence from SEQ ID NO:2.

29. A method of Claim 28 in which said DNA
molecule further comprises a promoter selected from the group
consisting of the CaMV35S and FMV35S promoters.

30. A method of Claim 29 in which the crop plant is 
selected from the group consisting of corn, wheat, rice, soybean,
cotton, sugarbeet, oilseed rape, canola, flax, sunflower, potato, 
tobacco, tomato, alfalfa, poplar, pine, apple and grape. 

FIG. 2